Hacker News Comments on
"Tree-sitter - a new parsing system for programming tools" by Max Brunsfeld
Strange Loop Conference
Youtube · 131 HN points · 4 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
You can watch a good Strangeloop presentation on Tree Sitter. https://www.youtube.com/watch?v=Jes3bD6P0To
Parsing (use of rather than theory) matters as it affects my work. So I followed up. See https://youtu.be/Jes3bD6P0To
Tree sitter is based on LR parsing (see 23:30 in above video) extended to GLR parsing (see 38:30).
I've had enough of fools on HN posting unverified crap to make themselves feel cool and knowledgeable (and don't kid yourself that you helped me find the right answer by posting the wrong one). Time I withdrew. Goodbye HN.
⬐ IshKebab
I'm not sure what you think you're refuting but Tree Sitter definitely does some different stuff to allow recoverable parsing.
It's great to see more tools adopting tree-sitter [1]. Having a (fast) single tool that can accurately parse most commonly used programming languages is incredibly useful, but it requires the maintenance of dozens of grammars, which is difficult without a large community effort. Hopefully increased adoption means more accurate parsers and support for even more languages.
Tree-sitter powers syntax highlighting on GitHub.com and (soon) neovim and OniVim 2. Hopefully regex-based syntax highlighting is a thing of the past soon. If you haven't seen the Strange Loop conference talk on tree-sitter [2] yet, it's worth a watch.
I think a Prettier-like code formatter using tree-sitter would be cool, both in terms of potentially broader language support and native performance.
⬐ minxomat
Important recent development in tree sitter was the new query language. Like TextMate or Sublime Grammars, ts in atom did use CSS selectors, but now it has a much more powerful s-expression query language which is useful for more than just syntax highlighting, e.g. static analysis. An application of that is Github's semantic, a haskell tool for code navigation and call graph analysis.
Demo and explanation: https://github.com/tree-sitter/tree-sitter/pull/444
⬐ adadgar
Neovim is aiming to integrate this in the next major release, v0.5: https://github.com/neovim/neovim/pull/11113
⬐ lewisl9029
I've been following tree sitter for a while, as I find the tech super cool and can't wait to see more practical applications.
One thing (among many others) that I've found really promising about Dark is its editor. See the hands-on video on their homepage for a demo: https://darklang.com/
It mostly feels like you're just typing text like in any regular text editor, but your inputs are actually manipulating the AST directly, and the editor itself ensures that your inputs can never result in an invalid program (i.e. there's no such thing as making a syntax error in Dark). It's inspired by tooling in the lisp world like Paredit and Parinfer, but Dark itself doesn't have to _look_ like a lisp because the structure of the AST is maintained by the editor itself instead of by users manually inserting and removing parens. It's an ingenious way to get most of the productivity benefits of a lisp-style syntax and all the structural editing tooling that comes with it, without intimidating new-comers with the super foreign looking parens infested syntax lisps are infamous for.
The other day I was actually briefly looking into whether or not it could be possible to replicate something like this in Atom using tree-sitter for some mainstream language like JS, but ended up getting blocked by the fact that Atom doesn't seem to offer an API for plugins to block/replace user input. This is probably for the best, given all the horrible ways this could be abused, but it does mean if I wanted to explore the idea further I'd probably have to either fork Atom to experiment with the idea or build something up from scratch, which is a pretty daunting undertaking given how deceptively complex modern editors can get these days.
But maybe I'm missing a different way to accomplish this in Atom with its existing APIs? Or does anyone know if VSCode's extension APIs can support this use case? I realize I've probably barely scratched the surface given how little time I've spent on it so far.
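To make the structural-editing idea above concrete, here is a minimal Python sketch, not Dark's actual implementation. It assumes a toy expression language; the names `Hole`, `Num`, `BinOp`, `insert_operator`, `fill_first_hole`, and `render` are all hypothetical. Each "keystroke" is an operation on the tree, and unfinished parts are represented as explicit holes, so the program is always a well-formed tree even while it is incomplete.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class Hole:
    """Placeholder for an expression the user has not typed yet."""


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Hole, Num, BinOp]

ALLOWED_OPS = {"+", "-", "*"}


def insert_operator(expr: Expr, op: str) -> Expr:
    """Typing an operator wraps the current tree; the right operand becomes a hole."""
    if op not in ALLOWED_OPS:
        # The editor rejects the input instead of producing an invalid tree.
        raise ValueError(f"unsupported operator: {op!r}")
    return BinOp(op, expr, Hole())


def fill_first_hole(expr: Expr, value: int) -> Tuple[Expr, bool]:
    """Typing a number fills the leftmost hole; returns (new tree, whether a hole was filled)."""
    if isinstance(expr, Hole):
        return Num(value), True
    if isinstance(expr, BinOp):
        left, done = fill_first_hole(expr.left, value)
        if done:
            return BinOp(expr.op, left, expr.right), True
        right, done = fill_first_hole(expr.right, value)
        return BinOp(expr.op, expr.left, right), done
    return expr, False  # a Num contains no holes


def render(expr: Expr) -> str:
    """Project the tree back to text; holes are shown as ___."""
    if isinstance(expr, Hole):
        return "___"
    if isinstance(expr, Num):
        return str(expr.value)
    return f"({render(expr.left)} {expr.op} {render(expr.right)})"


if __name__ == "__main__":
    tree, _ = fill_first_hole(Hole(), 1)   # user types "1"
    tree = insert_operator(tree, "+")      # user types "+"
    print(render(tree))                    # incomplete, but still a valid tree
    tree, _ = fill_first_hole(tree, 2)     # user types "2"
    print(render(tree))
```

The point of the sketch is only that the editor never holds a string that fails to parse: text is a projection of the tree, not the source of truth. A real editor would also need cursor handling, deletion, and a far richer grammar.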
⬐ leeoniya ⬐ dmortin
> The other day I was actually briefly looking into whether or not it could be possible to replicate something like this in Atom using tree-sitter for some mainstream language like JS,
already being done as part of CodeMirror v6:
⬐ minxomat
I really don't think it's inspired by Parinfer. It's likely based on the theory of structural editing and AST projections first popularized by JetBrains' CEO and available for experimentation in the open source project MPS. An end to end application of this theory is commonly referred to as a language workbench.
Papers: https://confluence.jetbrains.com/display/MPS/MPS+publication...
Language workbenches: https://www.martinfowler.com/articles/languageWorkbench.html
Nice intro to structural editing: https://medium.com/@mikhail.barash.mikbar/looking-at-code-th... (also mentions scratch)
⬐ carapace
> It mostly feels like you're just typing text like in any regular text editor, but your inputs are actually manipulating the AST directly, and the editor itself ensures that your inputs can never result in an invalid program (i.e. there's no such thing as making a syntax error in Dark).
The basic idea has been around for a while.
Here's something from the 80's: Alice Pascal https://www.templetons.com/brad/alice.html
> One of the first projects I did after forming Looking Glass Software Limited was a syntax-directed programming environment called Alice: The Personal Pascal.
> Syntax-directed editors are somewhat controversial, however I think they are quite good for people learning programming, and Alice was written first to be used in education in the school systems of Ontario. Our first sale was a contract to develop it for the Ministry of Education there.
Will tree sitter also stimulate creation of free tools which work on the AST? E.g. it's a mystery to me why we don't have free refactoring tools like the ones in IntelliJ. Like some free library which could extract methods, rename variables, etc. by modifying the AST. It does not seem too hard.
Is it because the current AST parsers are not fast enough or is there some other reason?
⬐ adamsmith ⬐ dmitriid
You need semantic understanding to do several of those operations. Parsing often isn’t sufficient.
⬐ dmortin ⬐ lioeters
Yes, but semantic understanding is not really complicated for rename variable, for example, so it's strange there is no library which can do that.
From my limited knowledge/experience, the use of language server protocol (like in VS Code editor) enables refactoring operations like you describe, for example, in TypeScript it can create a struct out of function parameters, or create a class from old function-prototype based definitions. Compared to IDEs like IntelliJ, though, I imagine the feature set is much, much smaller in scope.
I did see some discussion about integrating tree-sitter with VS Code, but the focus seems limited to syntax highlighting, not operating on ASTs.
⬐ lioeters
I found that the last time this talk was posted on HN [0], the author of tree-sitter mentioned that a couple of language servers are indeed using tree-sitter.
* Bash - https://github.com/mads-hartmann/bash-language-server
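As a rough illustration of the rename-variable exchange above, here is a minimal sketch that uses Python's standard `ast` module as a stand-in for a tree-sitter parse tree (an assumption made so the example runs without extra dependencies). The class `RenameLocal` and the helper `rename_local` are hypothetical names. It renames a variable only inside one named function, which is the small amount of scope knowledge that a purely textual or purely syntactic replace lacks.

```python
import ast  # ast.unparse requires Python 3.9+


class RenameLocal(ast.NodeTransformer):
    """Rename `old` to `new` inside the function `func_name` only."""

    def __init__(self, func_name: str, old: str, new: str) -> None:
        self.func_name = func_name
        self.old = old
        self.new = new
        self._inside = False

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        # Descend only into the target function. Other functions, including
        # nested ones, are left untouched (a deliberate simplification).
        if node.name == self.func_name and not self._inside:
            self._inside = True
            self.generic_visit(node)
            self._inside = False
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if self._inside and node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if self._inside and node.arg == self.old:
            node.arg = self.new
        return node


def rename_local(source: str, func_name: str, old: str, new: str) -> str:
    tree = ast.parse(source)
    RenameLocal(func_name, old, new).visit(tree)
    return ast.unparse(tree)


SOURCE = """
total = 10

def add(total, x):
    total = total + x
    return total

def other():
    return total
"""

if __name__ == "__main__":
    # The module-level `total` and the one read in `other` keep their name.
    print(rename_local(SOURCE, "add", "total", "acc"))
```

Even this toy version has to decide which `total` is meant, and it still ignores closures, `global`/`nonlocal` declarations, and name collisions with an existing `acc`. Those cases are where the "you need semantic understanding" objection applies.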
* Ruby - https://github.com/rubyide/vscode-ruby/tree/master/server
So... You write your grammars in Javascript. Which is then serialized to JSON by a parser defined in Rust, so that it can be compiled to C?..
That’s... a very roundabout way of doing things.
⬐ xvilka ⬐ rrampage
I asked[1] recently if it's possible to remove the need of the whole NodeJS. The conclusion is that it might be possible to use duktape instead.
⬐ maxbrunsfeld
Many parser generation tools use their own custom grammar language, and then generate a C parser based on that. With Tree-sitter, it’s a similar setup, except the grammars are written in JavaScript instead of some custom language.
The parser generator itself is all written in Rust, but the end user doesn’t need to use rust in any way.
The project page is at https://tree-sitter.github.io/tree-sitter/
⬐ dang
Discussed at the time: https://news.ycombinator.com/item?id=18213022
⬐ based2
(2018)
⬐ ggurgone ⬐ georgewfraser
(it is the title of the talk)
⬐ saagarjha
Dates are usually added to posts that aren't recent.
The most obvious application of tree-sitter is editors. I wrote a VSCode extension to replace the built-in syntax coloring with tree-sitter-based coloring: https://marketplace.visualstudio.com/items?itemName=georgewf...
I actually think it would make more sense for the various VSCode language extensions to just bake in tree-sitter for their language. I have had a PR open to do this with golang for a while: https://github.com/microsoft/vscode-go/pull/2555
⬐ ahelwer
Can you use tree-sitter for things that are more complicated than syntax highlighting, such as reference finding? I've been wanting to write a language server for a while but have been put off by the complexity of gracefully handling sections with incorrect syntax (while the user is typing, for example).
⬐ dmortin
What is the point of replacing the builtin syntax coloring? Is it faster or does it color more things?
⬐ Mathnerd314
Depending on the grammar I think it's a little slower than the regex-based TextMate coloring. But the overhead is mostly due to the VSCode plugin architecture.
⬐ georgewfraser
It colors more accurately.
⬐ dunkelheit
Builtin syntax highlighting for e.g. rust is laughably bad - the treesitter highlighting is much better. Side note: I've recently switched to vscode as my main editor and so far the experience has been full of contrasts - many advanced features such as remote editing are the real gamechangers and work flawlessly, but some basic features (the aforementioned highlighting, folding, basic git integration) are notably lacking in polish. You kind of expect that if they've gotten advanced stuff right then basic stuff is surely in order, but that is not the case.
⬐ AnthonBerg
Have you tried Jetbrains IntelliJ? In my experience the IntelliJ platform is, well, if you look in the direction VS Code is pointing, there you'll find IntelliJ?
Tangentially related, there's some tree-sitter activity in the Jetbrains org on Github: https://github.com/JetBrains?utf8=&q=tree-sitter&type=&langu...
which is cool
⬐ dunkelheit
I've used intellij a little bit and it is awesome (albeit a bit slow for my taste). The reason I stick to vscode is remote editing - compiling rust code locally on my laptop is a torture compared to compiling it on a beefy remote box! Remote editing in vscode is very well done, even most extensions work flawlessly without any changes. As I understand, there is nothing comparable for intellij.
⬐ AnthonBerg
Interesting!, thanks!
For a really nice solution to the error message problem, see this recent strangeloop talk: https://www.youtube.com/watch?v=Jes3bD6P0To
Basically it uses the parse tree disambiguation from the GLR parser to look for the most likely mistake the user made - it's very clever.