python, go, code quality, security, magic

Website and RSS:
https://itgram.orsinium.dev

Source:
https://github.com/orsinium/itgram

Author:
@orsinium
https://orsinium.dev/
If you want to learn Rust, the official website provides you with three options: the book, a course, and examples. I started with the examples, and they are very bad. The first example is a classic Hello World, but instead of just showing how to print text and moving on, the resource goes into a lengthy explanation of what traits you need to implement for your custom types to make them printable, without even explaining what a trait is. And the same trend goes through all the examples: dive into some details and throw around a bunch of terms without explaining them or even linking to explanations. I'm still not sure who the target audience is.

So, if you want to learn Rust, read the book; the book is pretty good.

The language itself is pretty strange too. Most modern languages realized that code is for humans, and that the central goal of syntax and most high-level features should be readability and the principle of least surprise. And then there is Rust, which is full of obscure symbols, implicit behaviors, and strange decisions.

Rust requires each statement to end with a semicolon. This is a strange decision for a modern language on its own. Most modern languages realized that you can tokenize code without them, and explicit semicolons are just visual noise.

But the best example of strange decisions is how a function returns a result in Rust. In Python and Go, you always have to write an explicit return statement. In Elixir, Haskell, and Ruby, functions implicitly return the last expression. So, if you don't want to return anything, you make sure that the last expression is nil, which is, again, explicit.

Rust solves it in a very unique way. Instead of requiring an explicit return statement or a nil at the end, it decides whether to return the expression based on whether there is a semicolon at the end. If the line doesn't end with a semicolon, the result is returned. If it has a semicolon, nothing is returned. Now, imagine you're reading some Rust code without going through the whole Rust book. What are the odds you'll figure it out? Some people will point out that the compiler will fail with a detailed error message if the semicolon and the type annotation don't match, but I'm talking about the experience of reading the code, not writing it.

At some point, people stopped using concise but hard-to-read languages, like Perl or APL. Well, they are still there but definitely not mainstream. Also, languages started to get rid of visual noise. You'll hardly find semicolons in most of them. I can't recall a modern language (except for PHP) that requires a special symbol every time you use a variable. Many statically typed languages, like Go, have at least basic type inference. And yet here we are: Rust gets into the mainstream with a compiler-centric syntax.
🎥 Seven Ineffective Coding Habits of Many Programmers by Kevlin Henney (text summary). I'm surprised I haven't published it before. A great conference talk about how humans read code, and so how to make your code a bit easier to read and understand.

This talk is what inspired me to create flake8-length, which keeps line length short without compromising the readability of messages, and flake8-comments, which tries to catch useless comments.
🐍 I've made svg.orsinium.dev. It's an online playground for svg.py, and it has quite a lot of potential. You can generate images in your browser. It's like Processing, but in Python! Unfortunately, there is no autocomplete in it (yet?), and a nice autocomplete is the whole point of svg.py. Still, quite useful for debugging your art.

svg.py turned out to be more useful for me than I expected, TBH. I created it to draw cards for my board game, and I still haven't got to that, shame on me. But despite this, I've already used svg.py in punchline for producing punch stripes for my music box, and in generative-art, which has only 4 images so far, but I hope it will grow.
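
Here is roughly what drawing with svg.py looks like (a minimal sketch from memory, so check the README for the exact API):

import svg

def draw() -> svg.SVG:
    return svg.SVG(
        width=60,
        height=60,
        elements=[
            # every element is a dataclass; attribute names mirror SVG attributes
            svg.Circle(cx=30, cy=30, r=20, stroke="red", fill="white", stroke_width=5),
        ],
    )

print(draw())  # prints the SVG markup, ready to be saved into a file

Paste something like this into the playground and the image gets rendered right in the browser.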
🔧 netlify is quite cool. It's a hosting service for static websites with a lot of nice features: custom domains, automatic integration with GitHub (providing CI and previews), automatically renewed Let's Encrypt certificates, configuration-as-code. I use it for all my open-source websites (wps.orsinium.dev, orsinium.dev, svg.orsinium.dev, gweb.orsinium.dev, you get the idea), and I'm quite far from running out of the quotas they provide for free accounts.

For quite a long time, I avoided all static website hostings because I thought they can only serve whatever HTML you already have in the repo. No, netlify can also build the pages. For each build, it spawns an Ubuntu Docker container with a lot of stuff already installed (like Python and Go) and calls whatever custom command you've provided (which can be a shell script if you have complex build logic). For example, in svg-playground, netlify.toml tells it to run netlify.sh, which installs and runs task. And task executes Taskfile.yaml, which does quite a few things, producing in the end all the HTML pages and the WASM binary. It could be fewer files if you wanted, but I felt like being more verbose there.
Registration for Hacktoberfest 2022 is open. Hacktoberfest is a yearly event where you get some swag (a T-shirt and stickers, or a donation towards planting a tree) for submitting 4 PRs to open-source projects during October.

Hacktoberfest 2020 had a lot of spam, so this year the initiative got some improvements, even more than in 2021:
1. Maintainers will also receive swag for reviewing and triaging Hacktoberfest PRs.
2. If 2 of your PRs are marked as spam, you are automatically blocked.
3. You have to tick 4 different checkboxes to acknowledge the previous point.
SimpleParsing is a little Python library that "does one job and does it well".

If you, like me, want everything in your Python code to be type annotated (for the sake of autocomplete, semantic syntax highlighting, and safety), you may know that the container that argparse returns (argparse.Namespace) isn't type checked, because mypy can't statically know what flags of what types you registered in ArgumentParser. And if you want to make it type-safe, there is quite a bit of boilerplate: define the flags, define a type-safe container, unpack the flags into the container. And if the definition and the container mismatch, well, your code is wrong.
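
Here is the problem in a nutshell (a small made-up example):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--retries', type=int, default=3)
args = parser.parse_args()

retries: str = args.retries  # mypy is fine with this, even though the value is an int:
                             # Namespace attributes are Any, so nothing gets checked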

SimpleParsing solves exactly this problem. You define a dataclass, annotate the attribute types, set defaults, add comments to attributes, and then SimpleParsing will turn it into CLI args. Attribute names form the names of the CLI flags, types are checked, defaults are respected, and comments are turned into help messages.
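
Roughly like this (a sketch based on my memory of the README, so double-check the exact API there):

from dataclasses import dataclass
from simple_parsing import ArgumentParser

@dataclass
class Options:
    name: str         # --name, required, parsed as str
    retries: int = 3  # --retries, optional, defaults to 3

parser = ArgumentParser()
parser.add_arguments(Options, dest="options")
args = parser.parse_args()
opts: Options = args.options  # a regular dataclass instance, fully type annotated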

I've seen quite a few alternatives for that task, and all of them work on top of pydantic, click, or attrs. And I really like that SimpleParsing works with what we already have in stdlib, without bringing unnecessary dependencies.

And don't listen to anyone, click sucks. Testing it is quite hard, functions with 20 arguments and 20 decorators on top aren't nice, and passing all that stuff deeper into the code is hard and verbose. Using click encourages dirty, verbose code and bad practices. Oh, and IMHO the CLI it produces is worse than that of argparse.

#python
Why sprint estimation has broken Agile. The only bad thing about this article is that it's on Medium. The rest is great. I was even going to write my own article on the subject, but someone already did it much better.

We've lost the meaning of story points along the way. Bob Martin (one of the people who signed the famous Agile Manifesto) in his talk on Clean Code covers why they introduced the idea of story points. You're starting a new big project. The business wants to do a lot of things, but the time is limited and not all of the features are needed for the MVP. It would be great to estimate how long each task will take (before we even have the team), so we can better plan and prioritize. So, how to approach it? Uncle Bob suggests writing all features on cards (for example, "implement user registration"), picking one card, and putting a number on it. Let's say, 5. Why 5? IDK, just a good number. And then estimate all other tasks relative to this one. Implementing log-in is faster than registration? Put 3 on it. The PoC of the news feed is harder? Put 8. That's it.

This is why story points aren't hours. This is why they don't matter for a project without a deadline. This is why the article is good.
I've stumbled across the work of Ray Toal, and there is a lot to explore. I wish I had had lectures like these at my university. Algorithms, data structures, databases, programming languages, security, networking, and much more.

I especially like Computer Language Semantics. He covers what syntax is, syntax design, and gives good introductions to many languages with lots of examples (for example, to Clojure). Quite a good way to get started with a new language when you already know some programming. He also published a book, Programming Language Explorations, which, I think, is based on these courses. There is a ple GitHub repo for the book with example programs in each language he covers (and that's a lot of languages). Also, the book's landing page links a few similar resources, like Syntax Across Languages, a big page that compares how the syntax for different things (like a function call) looks in different languages. And he also provides an overview page of different languages, with a short description, application examples, links, and tags.

But there is more cool stuff; I can't reasonably cover it all in a single post. For example, this guide on clean code. Or the intro to the command line. As I said, a lot to explore. I recommend going right to his home page and finding for yourself what resonates with you:
https://cs.lmu.edu/~ray/
🏃 go-recipes is a collection of Go snippets and tools for doing various cool little things in Go: draw a dependency graph, prettify the test coverage report, parallelize tests, show Go assembly, etc.

#golang
Aging programmer is a short list of random thoughts by a 40 y.o. engineer on what has changed for him over the years and what has stayed the same.
🔧 onefetch is a little tool to show some basic stats about a git repo: authors (and the percentage of code contributed), commits, lines of code, languages, size. And that's it, really. Not much, but it's quite easy to install and quick to run (Rust at its best), unlike some other alternatives I tried before.

#cli
🔧 d2 is a CLI tool (written in Go) for generating diagrams from a text definition of what is related to what and how. If you know PlantUML, MermaidJS, or GraphViz, it's the same idea, but a bit friendlier. For example:

block1 -> block2: text on arrow
block2 -> block3
block4

If you want more examples, the authors maintain a text-to-diagram.com website that compares D2 to similar tools.

I tried it for a few projects, and that's what I'm going to use from now on for all diagrams. It's easy to install, has a nice syntax, and gives quite a nice output. There is no way to apply a custom style to a group of elements (like a class in CSS), but other than that, it's fine. When you need custom colors or something, you can still do it on a per-element basis.

#cli
🎥 Make Illegal States Unrepresentable is a beginner-friendly talk about writing good type annotations. The speaker covers what type safety and illegal states are, why they matter, and shows three techniques for achieving that. The examples are in Scala, but they can be translated to Python, Rust, or TypeScript. Not to Go, sorry, no algebraic types and atoms for you.
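
To give a taste of the idea in Python (my own made-up example, not one from the talk): instead of one class where some field combinations make no sense, give each valid state its own type and make the union the whole state space:

from dataclasses import dataclass

# Bad: nothing prevents connected=False with a session_id set,
# and every consumer has to re-check the combination.
@dataclass
class ConnectionState:
    connected: bool
    session_id: str | None = None

# Better: one type per valid state, illegal combinations can't be constructed.
@dataclass
class Connected:
    session_id: str

@dataclass
class Disconnected:
    pass

Connection = Connected | Disconnected

Now mypy forces you to handle both cases explicitly wherever a Connection is accepted.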
Go 1.20 has been released! I'm most excited about the addition of profile-guided optimization (PGO). Now you can run in production a copy of your service with the profiler enabled, redirect a small percentage of users there, export the collected profile, and recompile the service with -pgo=/path/to/profile. The compiler will use the information from the profile about which lines of code are executed more often and optimize the binary for them. More specifically, the Go 1.20 compiler will do more aggressive inlining of the hot parts, which gives a 2-4% performance improvement on average. That's not much, but it might become much more in future versions of the compiler.

I always liked the idea of PGO. There were many blog posts about people moving parts of code around (and also many PRs into Go itself doing the same for stdlib) to get better performance for the most often executed paths. And it always bothered me as one of those things that the compiler should do instead of humans. Well, now it can!

More in the Go documentation: Profile-guided optimization.

#golang
A week ago, SQLAlchemy 2.0.0 was released. Now, the default way to describe ORM models is declarative, type-annotation-based definitions. So, instead of id = Column(Integer), a field can be described as id: Mapped[int]. In some cases, it can get more verbose, but it pays back with better IDE integration, syntax highlighting, type checking, and other cool stuff that comes with type annotations.
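
A minimal sketch of the new style (see the SQLAlchemy 2.0 docs for all the details):

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]        # a plain non-nullable column
    bio: Mapped[str | None]  # Optional makes the column nullable

Now user.id is an int for mypy and your IDE, not an opaque Column.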

I wanted to try different alternative ORMs with asyncio and typing support, but never got to it. Now I think it's getting quite hard to beat SQLAlchemy. The project, despite being very old, still keeps up the pace (which I can't say about Django ORM, Pony ORM, or Peewee) and has very good support for modern practices. Namely, for asyncio and type annotations.

Anyway, there are some asyncio-powered ORMs that I haven't tried but that look interesting:

+ sqlmodel is a thin wrapper on top of pydantic and sqlalchemy from the author of FastAPI. It's not actively maintained, but there is not that much code in it for that to be a big problem. This is the most popular ORM on this list because the author is famous.
+ ormar is another wrapper on top of pydantic and sqlalchemy to consider. Don't get deceived by the number of commits, though: they are all from dependabot.
+ tortoise-orm is an asyncio ORM inspired by Django ORM. At this point, I'm not sure anymore that this is a good idea. A long time ago, I used to like Django ORM for its simplicity, but now I'm more skeptical about that simplicity, as I've learned how much it costs in performance and testability. Internally, it uses pypika for building the queries.
+ piccolo has quite a nice query builder, but model definitions aren't declarative. Also, they say it's "fully type annotated", but that's not what you might expect. There is Any all around, and no type safety at all in what the queries return.

#python
Hosting SQLite databases on Github Pages is a blog post about how to run SQLite on the frontend. SQLite gets compiled into WebAssembly, and the author implemented a virtual FS (sql.js-httpvfs) that serves the database in chunks instead of downloading the whole database at once. Apparently, for it all to work well, you need either a small DB or good indices that allow getting all the needed information without fetching the whole database.
📝 Parse, don't validate is another good blog post about writing better type annotations. In short, instead of checking the correctness of values everywhere (or even worse, implicitly assuming they are valid), check them once, convert the value into a new type, and then accept that type in every place where an already validated value is expected. For example, in Python:

from typing import NewType

Username = NewType('Username', str)

def parse_username(raw: str) -> Username:
    ...

And then every other function that expects an already validated username should be annotated with Username.
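
So a function deeper in the code might look like this (my own continuation of the example above):

def greet(username: Username) -> str:
    # no re-validation needed: the type says it has already happened
    return f'hello, {username}'

raw = input()
greet(parse_username(raw))  # ok
greet(raw)                  # rejected by mypy: str is not Username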

There is more in the blog post, with examples, corner cases, and all that stuff, so I recommend reading it. The code examples are in Haskell, but they should be easy to understand (if you're not easily scared by monads), and the techniques can be translated to the language of your choice.
📝 Golang is evil on shitty networks is another reminder that networking is messy. The blog post and the related GitHub issue got some traction, and there are answers to it on the orange site from Russ Cox (the author of the code in question) and John Nagle (the author of the algorithm). All of it together is a very interesting read on networking issues and how they are solved in TCP.
Matching on negative literals results in a parser error. This GitHub issue is a good example of why I will never touch Elm. Elm is a very interesting functional language, but gosh, the governance is bad. The bug report is that Elm explodes if you try to use negative numbers in pattern matching, and this is the answer of the Elm author:

> Why are you doing this? Instead of fake examples, can you explain how this comes up?

D-uh. There was drama a long time ago when Elm banned using JS directly in third-party libraries (reserving this possibility for stdlib only), and that decision caused many people to leave the language community. I'm surprised anyone even stayed.

I guess the point of this post is that when picking a programming language, not only the language itself matters but also the community and the governance around it.

The Go core team is considering adding telemetry to the Go toolchain (compiler, formatter, linter, language server, all that stuff) that would be enabled by default. I wonder how far this story will go, but I think it will be fine. There already was talk that Go is just another Google product, that they don't care about the community, and that generics (the initial proposal based on contracts) were getting pushed through. And the Go team managed that story well. They canceled the controversial proposals, and what we now have in Go is much more community-accepted.

I could also rant about the Go community itself (and its years-long push-back on the very idea of generics), but I should stop somewhere.
📝 Functional Programming in C++ is a (mirror of a) blog post by John Carmack about using a functional approach in, well, any OOP language actually, not just C++. I like how well-thought-out, practical, and balanced his vision of this is. He covers pure functions and immutability, their trade-offs, and how to fit them into the messy OOP world. I think it's a good intro to the world of functional programming, just enough to get interested and start using it at work right away.
TLA+ is a formal specification language for describing the design of distributed systems and then letting the computer find issues with that design. The name stands for "Temporal Logic of Actions". "Logic" means that you mostly work with boolean values and operations (though you also have sequences, numbers, sets, and functions). "Temporal" means you can describe how your system changes over time (something will eventually happen, or something always happens). And finally, "Actions" means that your system is described as a state machine of states and transitions between them.

It was designed by Leslie Lamport (he's not the one who made the IDE and verifier, though), the creator of LaTeX. Hence the syntax of TLA+ is basically LaTeX. It's designed for mathematicians rather than programmers, and so describing imperative operations in it is mind-bending. So, Lamport later also created PlusCal, an imperative DSL that generates TLA+ code. TBH, it's all a big mess and a far cry from modern programming languages, their syntax, and conveniences, but it gets the job done.

The best place to learn TLA+ and PlusCal is learntla.com by Hillel Wayne, his blog, and his Practical TLA+ book. His materials have the benefit of being designed for engineers, unlike the math-heavy, obscure-symbol-loaded materials from Lamport. BTW, the book is dirt cheap; it almost feels like Hillel pays half of the production price for you. He himself says that he made learntla.com because he didn't like that his book costs money.

Don't worry about any of these materials ever becoming outdated: TLA+ hasn't had any updates for 9 years.