mccue.dev https://mccue.dev The Ultimate Guide to Data Structures and Algorithms (DSA) https://mccue.dev/pages/1-22-25-the-ultimate-guide-to-data-structures-and-algorithms This guide has two sections, one for employers and one for prospective employees. We'll start with employees. ## For Employees ### Step 1. Obtain a copy of "Algorithm Design" by Jon Kleinberg and Éva Tardos. Any edition is fine. ### Step 2. Follow structured coursework on that book or otherwise go through it in a way that you can manage. Slowly work your way through while living your life. [Here](https://www.cs.princeton.edu/~wayne/kleinberg-tardos/) are some slides and [here](https://www.edx.org/learn/algorithms/stanford-university-algorithms-design-and-analysis-part-1) is an edX course from Stanford. If this does not satisfy you, seek out a book of similar character. ### Step 3. Branch out. I understand why you are stressed about "Data Structures and Algorithms," but I guarantee that time spent elsewhere is going to be more valuable for you. If you've done steps 1 and 2 you have enough background info to begin to tackle whatever hard problems you run into in practice. Focus instead on getting practical experience building things. It doesn't matter if those things are trivial either: hastily writing a jank calorie tracker website or virtual shrine to Edward Cullen will provide more long term value to you than spending your evenings grinding Leetcode. Only do "Competitive Programming" if it is truly an interest of yours. It's fine if it is, but you don't need to do that kind of stuff to be qualified to perform most software jobs. Even the "prestigious" ones. ## For Employers ### Step 1. Please, for the love of god, cut the shit. If the position you are hiring someone for does not require implementing or understanding Gale-Shapley or reversing a linked list, do not make testing someone on that part of the interview process. 
I do not care if you fancy yourself an "elite institution" and want to "uphold standards." It's the TSA of hiring practices. All theater, no usefulness. When people know that's what they'll be tested on they spend time practicing for the test and not their actual responsibilities. This helps no one. ### Step 2. Rework your hiring processes to be more holistic. This will cost time and energy, but it will cost orders of magnitude less time and energy than having employees who can balance a trie but store passwords in plaintext. If you don't know where to start, give your technical interviewers this mind map and the explanation following it. [PDF link here](/pages/1-22-25-interview-mind-map.pdf). <img src="/pages/1-22-25-interview-mind-map.png" alt="Mind map of topics to cover in an interview" /> > It's mostly a memory aid for making sure I cover topics in an interview but it's also an ordered guide. I start at the top right and work my way down that side. I drill into the branches if a) the candidate is enthusiastic about a topic or b) they are very reticent. Once I get down the right hand side (mostly "soft" skills stuff), I move to the top left and work my way down that side (more technical/process skills stuff). I skip any topic they've already covered. I print out a fresh map for each candidate and make brief notes around the edge of the page. > > The "conflict resolution" branch tends to come up in the "worst project" area but it's there as a reminder to make sure it's covered if they haven't mentioned it elsewhere. > > Overall, it was written as a general guide, not specific to any particular programming language, so you have to play it by ear somewhat if you're interviewing for a senior FP role and you're only interested in FP, for example, or if you're interviewing for, say, a scrum master, or an ops role -- anything really specialized. 
> > Prior to the interview, I'll also use the map as a guide for highlighting things on their resume/CV that I want to dig into during the interview -- I may highlight parts of the resume or add "pre-notes" to their map, and use both side-by-side. > > I try to couch all of it in "tell me about ..." open-ended questions and avoid quizzing them on specific technology as much as possible. If they don't mention some of the specific tech organically that I want to hear about, I will "guide" them back to that at appropriate points. The diagram and explanation come from [Sean Corfield](https://github.com/seancorfield). He does not know I am writing this or quoting him. ### Step 3. Stop paying Leetcode and companies like them for candidates. It is in their best interests to feed slop into the slop trough. Do not gobble their slop. Wed, 22 Jan 2025 05:00:00 +0000 How to use SDL3 from Java https://mccue.dev/pages/12-26-24-sdl3-java Using native code libraries from Java is both easier than it's ever been and still a little bit frustrating. To understand that duality I wrote this pretty basic tutorial on how to use the newly-ish released [SDL3](https://wiki.libsdl.org/SDL3/FrontPage) from Java. This should be useful both for those invested in the Java ecosystem and those who have a more practical desire to use SDL3 for their own projects. While all I am going to do will be focused on Java, feel free to generalize to Clojure, Kotlin, Scala, or [flix](https://flix.dev/)<sup><a href="#1">1</a></sup>. ## Prerequisites - Git - SDKMan (`curl -s "https://get.sdkman.io" | bash`) - JExtract (`sdk install jextract`) - [Java 22+](https://adoptium.net/temurin/releases/?version=23) - Linux-like system. (Just to make this easier for me to write and test.) - [Just](https://github.com/casey/just) (Or just run the commands. I like using this.) ## Tutorial ### 0. 
Make a Hello World project ``` src/ Main.java ``` ```java public class Main { public static void main(String[] args) { System.out.println("Hello, world"); } } ``` As part of this, make a `Justfile` with a recipe to run the project. ``` help: just --list run: java src/Main.java ``` ### 1. Clone and Build SDL SDL is one of those dependencies you still get from source. There are platform specific ways to get this and other native libraries, but those are all nightmares in their own right. You can find the build instructions for your platform [here](https://wiki.libsdl.org/SDL3/Installation). ``` help: just --list # Clone and Build SDL sdl: rm -rf SDL git clone https://github.com/libsdl-org/SDL cd SDL && mkdir build cd SDL/build && cmake -DCMAKE_BUILD_TYPE=Release .. cd SDL/build && cmake --build . --config Release --parallel cd SDL/build && sudo cmake --install . --config Release run: java src/Main.java ``` Whether you want to keep SDL as a git submodule, clone it fresh every time, or something else is up to you. On my machine (M1 Mac) running the build outputs some warnings which leads to a non-zero exit code. While annoying, it does manage to finish the build so whatever. So long as you don't see errors you should be fine. ### 2. Generate Java Bindings To do this we will use `jextract`. ```shell jextract \ --include-dir SDL/include \ --dump-includes includes.txt \ SDL/include/SDL3/SDL.h ``` This will create a file with the command-line flags needed to include every symbol. This is useful so you can trim down the functions Java code will be generated for. Basically just go through this file and remove everything that doesn't start with `SDL_` or similar. Then, using that list of symbols to include, generate Java code. As part of this you should use the `--use-system-load-library` flag. This will generate the code such that it will pull `libsdl3` from the directly configurable `java.library.path`. 
```shell jextract \ --include-dir SDL/include \ --output src \ --target-package bindings.sdl \ --library SDL3 \ --use-system-load-library \ @includes.txt \ SDL/include/SDL3/SDL.h ``` ### 3. Update Run Configuration In order to call into native code you need to pass a flag to enable native access. This is because, in general, calling arbitrary C code can crash or otherwise bork the JVM. Native access permissions are given per-module. By default (unless you make a `module-info.java`) your code will be on the unnamed module, so we will use `ALL-UNNAMED`. ```shell java --enable-native-access=ALL-UNNAMED src/Main.java ``` A known quirk of using a library like SDL on Mac is that you need to also pass `-XstartOnFirstThread`. On non-Mac platforms I think you can leave this off. ```shell java \ -XstartOnFirstThread \ --enable-native-access=ALL-UNNAMED \ src/Main.java ``` And then we need to pass the path of our built SDL shared library. ```shell java \ -XstartOnFirstThread \ --enable-native-access=ALL-UNNAMED \ -Djava.library.path=SDL/build \ src/Main.java ``` If you are unfamiliar with `-Djava.library.path` - isn't that crazy? Consequence of build tools only caring about `--class-path` I think. ### 4. Make some calls to SDL The following is translated from [one of the SDL examples](https://github.com/libsdl-org/SDL/blob/main/examples/renderer/02-primitives/primitives.c). Note that while the C example it's based on has some callbacks, in Java you need to manually implement those lifecycle bits. Also note the uses of `try/finally`. One big difference between C and Java is that Java has exceptions. If you want cleanup code to always run (such as `SDL_DestroyWindow`) you need to account for exceptions. 
```java import bindings.sdl.SDL_Event; import bindings.sdl.SDL_FPoint; import bindings.sdl.SDL_FRect; import java.lang.foreign.Arena; import static bindings.sdl.SDL_h.*; public class Main { public static void main(String[] args) { try (var arena = Arena.ofConfined()) { SDL_SetAppMetadata( arena.allocateFrom("Example Renderer Primitives"), arena.allocateFrom("1.0"), arena.allocateFrom("com.example.renderer-primitives") ); if (!SDL_Init(SDL_INIT_VIDEO())) { System.err.println( "Couldn't initialize SDL: " + SDL_GetError().getString(0)); return; } var windowPtr = arena.allocate(C_POINTER); var rendererPtr = arena.allocate(C_POINTER); if (!SDL_CreateWindowAndRenderer( arena.allocateFrom("examples/renderer/clear"), 640, 480, 0, windowPtr, rendererPtr )) { System.err.println( "Couldn't create window/renderer: " + SDL_GetError().getString(0)); return; } var window = windowPtr.get(C_POINTER, 0); var renderer = rendererPtr.get(C_POINTER, 0); try { int numberOfPoints = 500; var points = SDL_FPoint.allocateArray(numberOfPoints, arena); for (int i = 0; i < numberOfPoints; i++) { var point = SDL_FPoint.asSlice(points, i); SDL_FPoint.x( point, (SDL_randf() * 440.0f) + 100.0f ); SDL_FPoint.y( point, (SDL_randf() * 280.0f) + 100.0f ); } var event = SDL_Event.allocate(arena); var rect = SDL_FRect.allocate(arena); program: while (true) { while (SDL_PollEvent(event)) { var type = SDL_Event.type(event); if (type == SDL_EVENT_QUIT()) { System.err.println("Quitting"); break program; } } /* as you can see from this, rendering draws over whatever was drawn before it. */ SDL_SetRenderDrawColor( renderer, (byte) 33, (byte) 33, (byte) 33, (byte) SDL_ALPHA_OPAQUE() ); /* dark gray, full alpha */ SDL_RenderClear(renderer); /* start with a blank canvas. */ /* draw a filled rectangle in the middle of the canvas. 
*/ SDL_SetRenderDrawColor( renderer, (byte) 0, (byte) 0, (byte) 255, (byte) SDL_ALPHA_OPAQUE() ); /* blue, full alpha */ SDL_FRect.x(rect, 100); SDL_FRect.y(rect, 100); SDL_FRect.w(rect, 440); SDL_FRect.h(rect, 280); SDL_RenderFillRect(renderer, rect); /* draw some points across the canvas. */ SDL_SetRenderDrawColor( renderer, (byte) 255, (byte) 0, (byte) 0, (byte) SDL_ALPHA_OPAQUE() ); /* red, full alpha */ SDL_RenderPoints(renderer, points, numberOfPoints); /* draw a unfilled rectangle in-set a little bit. */ SDL_SetRenderDrawColor( renderer, (byte) 0, (byte) 255, (byte) 0, (byte) SDL_ALPHA_OPAQUE() ); /* green, full alpha */ SDL_FRect.x( rect, SDL_FRect.x(rect) + 30 ); SDL_FRect.y( rect, SDL_FRect.y(rect) + 30 ); SDL_FRect.w( rect, SDL_FRect.w(rect) - 60 ); SDL_FRect.h( rect, SDL_FRect.h(rect) - 60 ); SDL_RenderRect(renderer, rect); /* draw two lines in an X across the whole canvas. */ SDL_SetRenderDrawColor( renderer, (byte) 255, (byte) 255, (byte) 0, (byte) SDL_ALPHA_OPAQUE() ); /* yellow, full alpha */ SDL_RenderLine(renderer, 0, 0, 640, 480); SDL_RenderLine(renderer, 0, 480, 640, 0); SDL_RenderPresent(renderer); /* put it all on the screen! */ } } finally { SDL_DestroyRenderer(renderer); SDL_DestroyWindow(window); SDL_Quit(); } } } } ``` Run the code with the flags discussed above. You should see a window pop up with a rectangle and some dots. <img src="./12-26-24-sdl-window.png"></img> ## Annoyances So while this is easy from top to bottom, there are some interesting properties you should be aware of. ### 1. `SDL_h` doesn't actually have all the functions. I assume because of limits on class size, with a library the size of SDL the `jextract`-generated binding code is split over multiple files. `SDL_h`, `SDL_h_1`, `SDL_h_2`, etc. This isn't an issue normally since you can just add more static imports, but it can be an issue for binary compatibility. 
If you end up directly accessing the static properties of `SDL_h_2` you might be in for a bad surprise if those symbols end up in a different class file when you next update. It's not a problem when the generated binding code is just part of your build, but it is an issue if you wanted to make a stable `sdl` artifact to share with other people. ### 2. The generated Java code is per-platform. Java doesn't actually have a C api - it has a "foreign function and memory" api. This means that the descriptions of native memory layouts include platform specific padding. `jextract` uses `clang` to figure out what the memory layouts for structs are and dumps those as part of its generated code. This means that if you want to use `jextract` generated code across different platforms you need to either make distinct artifacts per-platform or handle things dynamically at runtime. An easy way to get access to different platforms (which you should think of as target triples - `(operating system, architecture, libc)`) is via GitHub actions. Exercise for the reader on how to integrate that, [though I have one example](https://github.com/bowbahdoe/tui/blob/main/.github/workflows/mac_aarch64.yaml). ### 3. It would be work to share this Distributing a library which uses C code over the usual Java library channels like Maven Central can be annoying. The standard build tools (maven, gradle, etc.) do not provide an easy way to set `java.library.path` or get dependencies that should go there. What most people have historically done is to embed one or multiple shared libraries in a jar and extract them at runtime.<sup><a href="#2">2</a></sup> This is some bunk, but kinda the lowest common denominator approach. You can read [some of that nonsense here](https://github.com/fusesource/jansi/blob/master/src/main/java/org/fusesource/jansi/internal/JansiLoader.java). The binary compatibility issues alluded to above would also be something to consider. 
All that is to say - don't go just publishing libraries for every C library you want bindings for. At least at the moment there are some caveats and the ecosystem isn't super ready for it. If you are going to do that, provide a layer on top of the auto-generated `jextract` code. Have some value that is worth the time investment. ### 4. You need to care about memory lifetimes In the example code there is only one `Arena` and it lives for the entire program. If you have a need to allocate memory for a shorter timespan you'll need to make at least one allocator. While this is better than directly dealing with `malloc`/`free` it is still more responsibility than you usually have in Java code. It can be tempting to build APIs that use foreign memory the same as if they did not, but unless you have a really clear seam behind which to hide the memory shenanigans it is probably going to backfire. ## Conclusion If you want to make a game or game engine, this should be a good start. You can pivot to more C oriented SDL tutorials and translate the calls needed to open windows, render graphics, etc. If you want to do something similar for another native library, this should serve as a decent starting point. You can find the code for this demo [here](https://github.com/bowbahdoe/sdl-java-tutorial). <p id="1" style="font-size: 14px">1: Sidenote, but I think flix could kill Scala as the JVM language for Haskell-likin-types. Other than implicits + some basic type level stuff: Java has already taken or will take most of what made Scala interesting. "Stratified Negation," "Lattice Semantics," and "Associated Effects" intimidate me in a way I haven't felt in a while.</p> <p id="2" style="font-size: 14px">2: There is one method of distribution that doesn't have these problems: <i>.jmod</i>. JMods have a special place for shared libraries and will merge them in to a JDK it's linked with. 
I'm investigating the possibilities of that on the side.</p> Thu, 26 Dec 2024 05:00:00 +0000 Java Build Scripts https://mccue.dev/pages/8-26-24-java-build-scripts I've written before about how I think that while they need some bolstering ([here](https://mccue.dev/pages/1-11-24-cli-flow)) using the CLI tools to build Java code is more practical than you might think ([here](https://mccue.dev/pages/5-29-24-module-compilation) and [here](https://mccue.dev/pages/5-30-24-module-libs-and-tests)). What I didn't talk about, or tip-toed around, is that writing build scripts in `bash`, `PowerShell`, `cmd.exe`, etc. is not very cross-platform. You can install `bash` on Windows and run in [WSL](https://learn.microsoft.com/en-us/windows/wsl/install), but that feels unideal. An extra setup step is one thing, but needing to ask students who just learned that the command line exists to also make sure they aren't accidentally running in `PowerShell` is painful. You could also just ignore the problem. "Real" developers use Mac or Linux, right? Well those same developers sometimes pick an extra special shell for themselves like `zsh`, `nushell`, or `fish`. You have a similar, if less serious, problem. Using any of those shells for experimenting or testing out commands is fine. Until you write `$()` or a file path they are all more or less the same. What we really need is some way of writing out commands that will work on Windows, Mac, and Linux and regardless of whether someone is using `bash`, `PowerShell`, `cmd.exe`, `zsh`, `nushell`, or `fish`. If only we had a language that could be written once and then run anywhere. ## just Since Java 22 we've been able to write `java Main.java` and execute a potentially multi-file program. Before that there was a significant bootstrap problem. If you write Java code to compile Java code, who compiles that Java code? 
Now that that's there, we can start to consider what it would look like to run a CLI tool from Java code and compare that to the alternatives. The alternative I've been using is [`just`](https://github.com/casey/just). `just` is a command runner similar to `make` but without any of the caching `make` does or the wild syntax and history `make` is burdened with. You write the name of a "recipe", a `:` then an indented list of commands to run. ```just demo: javac --version jlink --version ``` For this, if you run `just demo` it will run each command in sequence. ```bash $ just demo javac --version javac 22.0.1 jlink --version 22.0.1 ``` If any command gives you a non-zero exit code it fails immediately. ```bash javac --v error: invalid flag: --v Usage: javac <options> <source files> use --help for a list of possible options ``` And, by default, it will echo the command it's about to run before it runs it. All of these properties are useful for different reasons. * Printing the command before it's run is useful when something fails. You can usually just copy-paste the command and tweak it until it works. Then you can just copy-paste the working command back in place. * Failing immediately on a non-zero exit code is a good default for what I feel are obvious reasons. * Being able to refer to a command or group of commands by an alias is almost required to do anything interesting. `just compile` is much more ergonomic to use than `javac -d build --module-source-path "./*/src" --module example`. Also, and it feels small but isn't, you can get a list of all the commands + a comment on how to use them with `just --list`. ``` $ just --list Available recipes: demo ``` ## Run commands in Java I want all of these properties so let's see how we can get them in `Java`. To run a command we can use the `ProcessBuilder`. 
```java import java.util.List; public class Project { public static void main(String[] args) { var cmd = List.of("javac", "--version"); var pb = new ProcessBuilder(cmd); } } ``` Then all we need to do is start the command, wait for it to finish, and record the exit status. If it's non-zero, throw. ```java import java.io.IOException; import java.io.UncheckedIOException; import java.util.List; public class Project { public static void main(String[] args) { var cmd = List.of("javac", "--version"); var pb = new ProcessBuilder(cmd); try { int exitStatus = pb.inheritIO().start().waitFor(); if (exitStatus != 0) { throw new RuntimeException( "Non-zero exit status: " + exitStatus ); } } catch (InterruptedException e) { throw new RuntimeException(e); } catch (IOException e) { throw new UncheckedIOException(e); } } } ``` Yeesh. So that sucks. We haven't even gotten to printing out the command or labelling groups of commands yet. ## 🚧Construction Zone 🚧 I've been noodling on this for a time and this weekend<sup><a href="#2">2</a></sup> I think I finally came up with a half-decent API. ```java import dev.mccue.tools.ExitStatusException; import dev.mccue.tools.Tool; public class Project { public static void main(String[] args) throws ExitStatusException { Tool.ofSubprocess("javac") .run("--version"); } } ``` This will print out the command to `System.err`, run it, throw an exception if needed, and pipe output to `System.out`/`System.err` as needed. Problem is that now we've introduced a dependency. Hand-waving how you get the dependency<sup><a href="#1">1</a></sup>, the command you need to run the code changes from `java scripts/Main.java` to `java --module-path scripts/libs --add-modules ALL-MODULE-PATH scripts/Main.java`. That's simply too much to remember. ### Argument Files To deal with this we can use argument files. If we make a file called `project` at the top level of our project with the following contents. ```bash --module-path scripts/libs --add-modules ALL-MODULE-PATH ``` Now we can run the script with just `java @project`. 
This works as if all the arguments in the file were applied inline in the invocation. Most tools that come with Java, including the `java` launcher itself, support this. ### picocli As for identifying groups of commands, there is a solution there too. Now that we've opened the floodgates on our Java build script having dependencies, what's one more? ```java import dev.mccue.tools.ExitStatusException; import dev.mccue.tools.Tool; import picocli.CommandLine; @CommandLine.Command( name = "project" ) public final class Project { public static void main(String[] args) { new CommandLine(new Project()).execute(args); } @CommandLine.Command(name = "demo") public void demo() throws ExitStatusException { Tool.ofSubprocess("javac") .run("--version"); } } ``` If we use `picocli`, then it's trivial. Our build script is a CLI program like any other, why not use normal CLI libraries? We get the ability to run commands by name. ``` $ java @project demo javac --version javac 22.0.1 ``` And we even get to list commands in a way. ``` $ java @project Missing required subcommand Usage: project [COMMAND] Commands: demo ``` Yay! ### Tool Tailored APIs While `Tool.ofSubprocess("javac").run("--version");` is complete in a sense, it's not very fun to use. What we generally want from a Java API is method-level autocomplete. Having to separately reference a man page doesn't spark joy in me, and it shouldn't spark joy in you. I started this particular adventure wanting to translate options from the CLI more or less 1-1. This is for two broad reasons. 1. I think learning how to use the CLI from Java should be transferable knowledge when writing commands the old fashioned way and vice-versa. 2. There are a lot of CLI tools and coming up with a creative name for every argument that can only be specified as `-g` is painful and a lot of work. The transform I started with was for every argument that `--looks-like-this` I would add a method to an arguments object that `looksLikeThis`. 
```java Javadoc.run(arguments -> { arguments .moduleSourcePath("./modules/*/src") .d(Path.of("build/javadoc")) .module("dev.mccue.tools"); }); ``` This works pretty well, but look at these two options from `javadoc`. ``` javadoc --help ... --version Print version information ... -version Include @version paragraphs ... ``` It has both `-version` and `--version` and they do wildly different things. Great. Awesome. This is a one-off example, but CLI tools are fundamentally textual apis. `--some-thing` to `someThing` isn't just a stylistic change, it's a lossy transformation. So I gave up. My only strategy now is to take arguments that `--look-like-this` and turn them into ones that `__look_like_this`. It might be ugly, but at least I don't run into strange problems anymore. As a side benefit, it does now look a lot more 1-1 with the CLI api. ```java Javadoc.run(arguments -> { arguments .__module_source_path("./modules/*/src") ._d(Path.of("build/javadoc")) .__module("dev.mccue.tools"); }); ``` ## Conclusion I translated the spring demo repo I was using for some previous posts to use this approach. It includes running junit tests and managing multiple modules. You can find it [here](https://github.com/bowbahdoe/java-project-spring-demo) and the build script specifically [here](https://github.com/bowbahdoe/java-project-spring-demo/blob/main/scripts/src/Project.java). Note that the libraries I referenced, save for picocli, are very likely to change in backwards-incompatible ways as I iterate on them. Don't use them for anything serious yet, but you can find them [here](https://central.sonatype.com/artifact/dev.mccue/tools-jdk). There are still some bootstrap and polish issues, but I think this approach is becoming more and more viable as it's chipped away at. --- Share thoughts, design feedback, etc. in the comments below. 
<p id="1" style="font-size: 14px">1: <code>jresolve --output-directory scripts/libs pkg:maven/dev.mccue/tools-jdk@2024.08.25.5</code>.</p> <p id="2" style="font-size: 14px">2: This approach/outlook on tooling has a lot of similarities to bach and the work Christian Stein has been doing. Will likely elaborate more on the difference between this approach, bach's approach, bld's approach, etc. when I personally have more mental clarity on it.</p> Mon, 26 Aug 2024 05:00:00 +0000 C Growable Arrays: In Depth https://mccue.dev/pages/8-21-24-c-growable-arrays-in-depth An extremely common question people have when using `C` for the first time is how to make an array of elements that can grow over time. I know this is a common question because one older post on this website where I explained the concept (badly) gets tons of organic traffic. It's not a bad question either. Nearly every language you might be coming at `C` from has an equivalent. * Python: `list` * JavaScript: `Array` * Java: `ArrayList` * ... etc. And, ignoring primacy, `C` classes often have students make data structures for their assignments. So I figure it might be useful to at least one person to give a walkthrough of how that data structure works in the `C` world. Just keep in mind that I am not a professional `C` programmer. If I get anything wrong or there is something you wish I mentioned, feel free to mention it in the comments below or wherever. I'll make corrections. ## Arrays An array is a fixed size collection of data. ```c int numbers[] = { 1, 2, 3 }; ``` Being fixed size means that if an array starts out with enough space for 3 elements, there is no way to make space for a 4th. C arrays are also, more or less, equivalent to pointers that just so happen to point to the start of a chunk of memory. So whenever you see something like `int[]` in code you can mentally translate that to `int*`. Most languages' array-equivalents can have their size queried at runtime. 
`C` is a bit special in that there is no way to recover the number of elements in an array after you make it. It is just a pointer to a chunk of memory after all. This means you have two basic options for being able to figure out the size of an array. ### 1. Have a sentinel terminate the array One way to be able to figure out the size of an array is to put a special sentinel value as its last element. Code working with the array can then proceed forward until that special value is reached. This may or may not be an option depending on the kind of data being stored in an array. The most common use of this actually comes from how `C` stores strings. ```c #include <stdio.h> int main() { char* hello1 = "Hello"; char hello2[] = { 'H', 'e', 'l', 'l', 'o', '\0' }; printf("%s\n", hello1); printf("%s\n", hello2); return 0; } ``` "C style strings" are an array of characters terminated by a null character. If you want to find the length of something like this, just keep looping until you get to that terminator. ```c #include <stdio.h> int main() { char* hello = "Hello"; int i = 0; while (hello[i] != '\0') { printf("%c\n", hello[i]); i++; } return 0; } ``` An upside to this approach is that it's simple to understand. A downside is that you need to go through every element in the array to find its size, which is a pain. ### 2. Store it when you make the Array If we don't want to have the null terminator we need to store a number. One way to do this is to just manually count out how big an array is. ```c #include <stdio.h> int main() { int numbers[] = { 6, 4, 7 }; // I counted it with my eyes int numbers_size = 3; int i = 0; while (i < numbers_size) { printf("%d\n", numbers[i]); i++; } return 0; } ``` If the array variable itself is still in scope (and has not been passed to a function, where it becomes a plain pointer), you can make use of `sizeof` to calculate the size. 
```c #include <stdio.h> int main() { int numbers[3] = { 6, 4, 7 }; // Because numbers is declared as an array in this // scope, C knows its total size. Dividing that by // the size of an individual element gives the // number of elements. int numbers_size = sizeof(numbers) / sizeof(numbers[0]); int i = 0; while (i < numbers_size) { printf("%d\n", numbers[i]); i++; } return 0; } ``` You then need to handle passing that size around with the array whenever you give it to a function that takes an array as an argument. A good deal of the `C` standard library works like this. ## `size_t` Small digression. When you make a variable to store an index into an array or that stores the size of an array you are intended to use `size_t`. If you don't it seems like it's usually "fine," but I wouldn't risk the wrath of the undefined behavior demons. To have `size_t` be available you should put `#include <stddef.h>` at the top of your program. ```c #include <stddef.h> #include <stdio.h> int main() { int numbers[3] = { 6, 4, 7 }; size_t numbers_size = sizeof(numbers) / sizeof(numbers[0]); size_t i = 0; while (i < numbers_size) { printf("%d\n", numbers[i]); i++; } return 0; } ``` ## Heap Allocation At runtime, you can get an arbitrarily large block of memory in various ways. The most commonly known is `malloc`. You give it a size then it gives you a pointer to the start of that memory. You need `#include <stdlib.h>` to use it. ```c #include <stddef.h> #include <stdio.h> #include <stdlib.h> int main() { size_t numbers_size = 3; int* numbers = malloc(sizeof(int) * numbers_size); numbers[0] = 6; numbers[1] = 4; numbers[2] = 7; size_t i = 0; while (i < numbers_size) { printf("%d\n", numbers[i]); i++; } return 0; } ``` The one we will use is `calloc`. It works mostly the same as its cousin `malloc` with two major differences. The first is that you don't give it the full size of the array you want. You give it the number of elements you want and the size of each element separately. 
```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    size_t numbers_size = 3;
    int* numbers = calloc(numbers_size, sizeof(int));
    numbers[0] = 6;
    numbers[1] = 4;
    numbers[2] = 7;

    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }

    return 0;
}
```

The second is that the memory returned is already "zeroed." This means that you know that every element is in its zero-valued state. So for `int` it will literally be `0`, `_Bool`s will be `false`, pointers will be `NULL`, etc.

Often that doesn't matter but, because there isn't "uninitialized" memory with random data in it, it feels more predictable.

For both approaches you need to later `free` that allocated memory. You will be technically exempt from needing to do this if your program doesn't run for long enough to run out of memory. I think it is best to be a "good citizen" and `free` your memory regardless.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    size_t numbers_size = 3;
    int* numbers = calloc(numbers_size, sizeof(int));
    numbers[0] = 6;
    numbers[1] = 4;
    numbers[2] = 7;

    size_t i = 0;
    while (i < numbers_size) {
        printf("%d\n", numbers[i]);
        i++;
    }

    free(numbers);

    return 0;
}
```

Technically speaking `malloc`, `calloc`, etc. can fail if the system is out of memory. We are going to ignore that possibility for the rest of this, but the lower level the software you write, the larger the chance you will need to care about very limited memory scenarios.

## Single Type Growable Arrays

The basic concept of a growable array is to group three pieces of information: a pointer to an array of things, the number of elements allocated for that array, and the number of elements "actually" in the array.

```c
struct GrowableIntArray {
    int* data;
    size_t allocated;
    size_t size;
};
```

So for a new array with nothing in it, the `data` pointer would be null and both numbers would be `0`.
```c
struct GrowableIntArray growable_int_array_empty() {
    struct GrowableIntArray empty = {
        .data = NULL,
        .allocated = 0,
        .size = 0
    };
    return empty;
}
```

To add an element to the array, you check if adding the element would make the size of the array larger than what was allocated.

If it won't, you set an element in your `data` array and bump the size.

If it will, you need to make a new array that is bigger than the last one. How much bigger is more art than science, but generally people find success allocating around twice as many elements as were there before. That sounds crazy, but at worst you are only wasting half your memory. That's not that bad in the scope of things.

Then you copy over all the elements from the last array and free the old one.

```c
void growable_int_array_add(
    struct GrowableIntArray* array,
    int value
) {
    // If we wouldn't have enough room
    if (array->size + 1 > array->allocated) {
        // Double the size of the last array
        size_t new_allocated;
        if (array->size == 0) {
            new_allocated = 2;
        }
        else {
            new_allocated = array->size * 2;
        }

        // Make a new array that size
        int* new_data = calloc(new_allocated, sizeof(int));
        int* old_data = array->data;

        // Copy all the old elements to it
        if (old_data != NULL) {
            for (size_t i = 0; i < array->size; i++) {
                new_data[i] = old_data[i];
            }
        }

        // Then free the old array
        free(old_data);

        // And patch up the pointers
        array->data = new_data;
        array->allocated = new_allocated;
    }

    // And put in the new element
    array->data[array->size] = value;
    array->size++;
}
```

Which is a chunky function, but now you should be good to go on making something which you can use as an array but which dynamically grows as elements are added.
```c
int main() {
    struct GrowableIntArray numbers = growable_int_array_empty();
    growable_int_array_add(&numbers, 6);
    growable_int_array_add(&numbers, 4);
    growable_int_array_add(&numbers, 7);

    size_t i = 0;
    while (i < numbers.size) {
        printf("%d\n", numbers.data[i]);
        i++;
    }

    return 0;
}
```

From there it's all a matter of personal taste. Many would want to implement their own `growable_int_array_size` and `growable_int_array_get`. Both of these are relatively straightforward and useful if your goal is to avoid accessing struct members directly.

```c
size_t growable_int_array_size(
    struct GrowableIntArray* array
) {
    return array->size;
}

int growable_int_array_get(
    struct GrowableIntArray* array,
    size_t i
) {
    // You can do precondition checks and crash early if someone
    // tries to go out of bounds if you want.
    return array->data[i];
}
```

```c
int main() {
    struct GrowableIntArray numbers = growable_int_array_empty();
    growable_int_array_add(&numbers, 6);
    growable_int_array_add(&numbers, 4);
    growable_int_array_add(&numbers, 7);

    size_t i = 0;
    while (i < growable_int_array_size(&numbers)) {
        printf("%d\n", growable_int_array_get(&numbers, i));
        i++;
    }

    return 0;
}
```

But all of this has a **major** flaw. Do you see it?

It only works with `int`s!

If you want to have a growable array of `long`s or `Position`s or whatever, you need to copy and paste all of this code, change the types around, and make brand-new functions.

What we want is the ability to write code for a growable array once and then have that work for any kind of data we want to store.

That leaves us with two options.

1. Make a growable array that can be used for anything at runtime
2. Make a growable array that can be specialized for anything at compile-time.

## Runtime Generic Growable Arrays

What do an `int`, a `char` and a `struct Position` have in common?

Nothing. Save some really strange layout choices by a compiler, all of these data types require different amounts of memory.
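
To make that concrete, here is a minimal sketch that reports the sizes involved. `struct Position` is never defined in this post, so the two-`double` layout below is an assumption made up for illustration, and `print_sizes` is a hypothetical helper. The exact numbers depend on your compiler and platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical definition. The post never spells out what
// a Position holds, so two doubles is just a guess.
struct Position {
    double x;
    double y;
};

// The one size the language itself promises.
static_assert(sizeof(char) == 1, "char is always one byte");

// Prints how many bytes each type takes up on this machine.
void print_sizes(void) {
    printf("int: %zu\n", sizeof(int));
    printf("char: %zu\n", sizeof(char));
    printf("struct Position: %zu\n", sizeof(struct Position));
}
```

Calling `print_sizes()` from `main` on a common 64-bit desktop will usually report `4`, `1`, and `16`, but only the `1` for `char` is guaranteed.
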
What do an `int*`, a `char*`, and a `struct Position*` have in common?

Turns out all of them can be safely converted to and from a `void*`.

```c
#include <stdio.h>

int main() {
    int eight = 8;
    int* eightPointer = &eight;
    void* voidPointer = (void*) eightPointer;
    eightPointer = (int*) voidPointer;

    printf("%d\n", *eightPointer);

    return 0;
}
```

A `void*` is a pointer to "something." The `C` compiler forgets what kind of information is actually stored in it. On the platforms you are likely to use, all pointers to data have the same size, so now we have our way of storing anything.

```c
struct GrowableArray {
    void** data;
    size_t allocated;
    size_t size;
};
```

At first, it might seem like we can just do that and find+replace `int` with `void*` in the code from before. And you'd be right. Just be aware that things which were once `int*` will become `void**`. A pointer to an array of `void` pointers.

```c
void growable_array_add(
    struct GrowableArray* array,
    void* value
) {
    // If we wouldn't have enough room
    if (array->size + 1 > array->allocated) {
        // Double the size of the last array
        size_t new_allocated;
        if (array->size == 0) {
            new_allocated = 2;
        }
        else {
            new_allocated = array->size * 2;
        }

        // Make a new array that size
        void** new_data = (void**) calloc(
            new_allocated,
            sizeof(void*)
        );
        void** old_data = array->data;

        // Copy all the old elements to it
        if (old_data != NULL) {
            for (size_t i = 0; i < array->size; i++) {
                new_data[i] = old_data[i];
            }
        }

        // Then free the old array
        free(old_data);

        // And patch up the pointers
        array->data = new_data;
        array->allocated = new_allocated;
    }

    // And put in the new element
    array->data[array->size] = value;
    array->size++;
}
```

### Usability

The first problems that will arise are around usability.

To pass a pointer to something, that something can't be an `rvalue`. An `rvalue` is something that can only go on the *right* hand side of an equals sign. That's where the `r` comes from.

This means that you can't just directly pass in a pointer to an `int`.
```c
growable_array_add(&numbers, &6);
```

`&6` doesn't have a meaning to `C`. You need to have constant values first assigned to a variable.

```c
int n = 6;
growable_array_add(&numbers, &n);
```

This can be annoying to write out, but you might get used to it.

Even harder to come to terms with is needing to recover the type of a pointer whenever you get it out. You need to both convert the `void*` to an `int*` or whatever actual type you stored and, if it's something like `int`, dereference that pointer to get at the actual value.

```c
int value = *((int*) growable_array_get(&numbers, i));
```

The `C` compiler doesn't take kindly to mishandled `void*`s. If you get this wrong you get teleported to Florida.

```c
int main() {
    struct GrowableArray numbers = growable_array_empty();

    int a = 6;
    int b = 4;
    int c = 7;

    growable_array_add(&numbers, &a);
    growable_array_add(&numbers, &b);
    growable_array_add(&numbers, &c);

    size_t i = 0;
    while (i < growable_array_size(&numbers)) {
        printf("%d\n", *((int*) growable_array_get(&numbers, i)));
        i++;
    }

    return 0;
}
```

### Pointer Lifetimes

Pointers don't all "live" the same amount of time. You can take a pointer to a local variable, but that pointer is only valid so long as you are still within that function.

```c
int* example() {
    int x = 5;
    int* xPointer = &x;
    // Can use xPointer freely

    // But if you return the pointer out it won't
    // be valid
    return xPointer;
}
```

You can make pointers that live longer with `calloc`, but you later need to call `free` on them.

```c
int* example() {
    int* xPointer = calloc(1, sizeof(int));
    *xPointer = 5;

    // Valid to return, but something eventually
    // should free it.
    return xPointer;
}
```

This presents a problem for our array of `void*`s. If all the pointers are pointing to local variables on the stack then your cleanup should just be to call `free` on `array.data`.
```c
void growable_array_cleanup(struct GrowableArray array) {
    free(array.data);
}
```

But if the pointers are pointing to heap allocated memory then someone needs to clean them up later.

```c
void growable_array_cleanup(struct GrowableArray* array) {
    for (size_t i = 0; i < array->size; i++) {
        free(array->data[i]);
    }
    free(array->data);
}
```

Even worse than that, some things don't just need to be `free`-ed. They might have been allocated outside the `calloc`/`free` system or, like our `GrowableArray`, they might have some custom cleanup process.

To deal with the sheer variety of situations we need to store how we want to clean up elements in the array itself.

```c
struct GrowableArray {
    void** data;
    size_t allocated;
    size_t size;
    void (*cleanup)(void*);
};
```

This syntax - `void (*cleanup)(void*)` - is how you declare a pointer to a function. In this case a function whose return type is `void` and whose sole argument is a `void*`. If it looks confusing to you don't worry. It confuses me too.

```c
struct GrowableArray growable_array_empty(
    void (*cleanup)(void*)
) {
    struct GrowableArray empty = {
        .data = NULL,
        .allocated = 0,
        .size = 0,
        .cleanup = cleanup
    };
    return empty;
}
```

Once you have the cleanup function stored you can make a general cleanup function for the growable array itself.

```c
void growable_array_cleanup(struct GrowableArray* array) {
    if (array->cleanup != NULL) {
        for (size_t i = 0; i < array->size; i++) {
            array->cleanup(array->data[i]);
        }
    }
    free(array->data);
}
```

If their data is on the stack, you skip trying to `free` it. If it needs to be `free`-ed that can be done same as if you need to call `special_framework_destroy`.
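
To see the two cases side by side, here is a rough sketch that repeats the pieces above in one place and uses them both ways. The helpers `boxed_int`, `sum_ints`, `heap_example`, and `stack_example` are names made up for this illustration, and the `assert` in `growable_array_get` is just one way to do the "crash early" precondition check mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct GrowableArray {
    void** data;
    size_t allocated;
    size_t size;
    void (*cleanup)(void*);
};

struct GrowableArray growable_array_empty(void (*cleanup)(void*)) {
    struct GrowableArray empty = {
        .data = NULL,
        .allocated = 0,
        .size = 0,
        .cleanup = cleanup
    };
    return empty;
}

// Same doubling strategy as before. Allocation failure is ignored.
void growable_array_add(struct GrowableArray* array, void* value) {
    if (array->size + 1 > array->allocated) {
        size_t new_allocated = array->size == 0 ? 2 : array->size * 2;
        void** new_data = calloc(new_allocated, sizeof(void*));
        for (size_t i = 0; i < array->size; i++) {
            new_data[i] = array->data[i];
        }
        free(array->data);
        array->data = new_data;
        array->allocated = new_allocated;
    }
    array->data[array->size] = value;
    array->size++;
}

void* growable_array_get(struct GrowableArray* array, size_t i) {
    // Crash early on an out of bounds read.
    assert(i < array->size);
    return array->data[i];
}

void growable_array_cleanup(struct GrowableArray* array) {
    if (array->cleanup != NULL) {
        for (size_t i = 0; i < array->size; i++) {
            array->cleanup(array->data[i]);
        }
    }
    free(array->data);
}

// Hypothetical helper: copies one int onto the heap.
int* boxed_int(int value) {
    int* pointer = calloc(1, sizeof(int));
    *pointer = value;
    return pointer;
}

// Adds up every element, assuming they all point at ints.
int sum_ints(struct GrowableArray* array) {
    int sum = 0;
    for (size_t i = 0; i < array->size; i++) {
        sum += *((int*) growable_array_get(array, i));
    }
    return sum;
}

// Elements live on the heap, so `free` is the cleanup function.
int heap_example(void) {
    struct GrowableArray numbers = growable_array_empty(free);
    growable_array_add(&numbers, boxed_int(6));
    growable_array_add(&numbers, boxed_int(4));
    growable_array_add(&numbers, boxed_int(7));
    int sum = sum_ints(&numbers);
    growable_array_cleanup(&numbers);
    return sum;
}

// Elements live on the stack, so there is nothing to clean up.
int stack_example(void) {
    int a = 6;
    int b = 4;
    int c = 7;
    struct GrowableArray numbers = growable_array_empty(NULL);
    growable_array_add(&numbers, &a);
    growable_array_add(&numbers, &b);
    growable_array_add(&numbers, &c);
    int sum = sum_ints(&numbers);
    growable_array_cleanup(&numbers);
    return sum;
}
```

Both `heap_example()` and `stack_example()` return `17`. The only difference between them is what gets passed to `growable_array_empty`: `free` works directly because it already has the `void (*)(void*)` shape.
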
While this might seem like we've solved the problem, notice that no matter what we need to now track when to call a special `growable_array_cleanup`, each array has an extra pointer of memory, and has to check if `array->cleanup != NULL` at close.

Everything comes at a cost.

### Memory Locality

Following the same theme, this sort of structure is forced to have subpar memory locality.

If you were to make an `int` array, the memory would be laid out like this with each `int` being directly next to the others.

```c
-------------
| 5 | 4 | 3 |
-------------
```

When we make an array of `int` pointers the memory layout looks like this.

```c
-------------------
| ptr | ptr | ptr |
---|-----|-----|---
   V     |     |
   5     V     |
         4     V
               3
```

Modern CPUs love going through arrays in order. They hate following pointers. This memory layout is almost guaranteed to lead to subpar performance compared to the tightly packed array.

Notice also that we didn't choose this memory layout because we wanted to. We chose it because we didn't want to write out the data structure more than once.

## Compile-Time Generic Growable Arrays

If we don't want everything behind a `void*` we need a perfect vinaigrette of clever and stupid.

### Template Headers

The only things that need to change between a growable `int` array and a growable `struct Position` array are the struct names, function names, return types, and arguments.

```c
struct GrowableIntArray {
    int* data;
    size_t allocated;
    size_t size;
};

struct GrowablePositionArray {
    struct Position* data;
    size_t allocated;
    size_t size;
};
```

But we don't need to do that by hand. We have the `C` preprocessor.

```c
#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_DATA_POINTER int*

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};
```

If you've made a `C` header file before you've probably seen a prelude like this.

```c
#ifndef SOME_FILE_H
#define SOME_FILE_H

// ... CODE FOR HEADER HERE ...

#endif
```

The purpose of this is so that if more than one file includes the header the code for it only shows up once.

Here we don't want to do that. We want it to be able to be included multiple times in one compilation.

In our `growable_array.h` we want to assume that `GROWABLE_ARRAY_STRUCT`, `GROWABLE_ARRAY_DATA_POINTER`, and whatever else we need defined are already defined by whatever code is including the header.

`#ifndef` and `#error` can give some basic guardrails for that.

```c
#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};
```

Then we make one header file for each "specialization" we want of the growable array. So for `int`s we would make `growable_int_array.h` and put something like the following in it.

```c
#ifndef GROWABLE_INT_ARRAY_H
#define GROWABLE_INT_ARRAY_H

#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_DATA_POINTER int*

#include "growable_array.h"

#undef GROWABLE_ARRAY_STRUCT
#undef GROWABLE_ARRAY_DATA_POINTER

#endif
```

First, the normal header prelude. We don't want the growable int array to be defined more than once.

Then we define the variables needed for our template header, include that header, and `#undef` those variables afterward.

The reason we bother with `#undef` is the same reason this works in the first place. The `C` preprocessor just does text replacements. When we include `growable_array.h` it literally spits the contents of that file in place. If we don't `#undef` a variable we defined it can lead to some head-scratchers compiling some other file.

But now all other code needs to do is include `growable_int_array.h` to get a growable array for `int`s.

All we need to do to get a growable array for a specific type is do some `#define`s.
Rinse and repeat for any other kind of growable array we want.

### Pointer Lifetimes

Using `void*` might have forced us to always handle the lifetimes of those pointers and using `int` or whatever else without indirection lets us skip over memory management.

Unfortunately memory management is a fact of life in `C`. If the kind of thing we are storing needs to be cleaned up we need to track that.

```c
#include <stddef.h>
#include <stdlib.h>

#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_STRUCT_POINTER
#error "GROWABLE_ARRAY_STRUCT_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME
#error "GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

void GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array
) {
#ifdef GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME
    for (size_t i = 0; i < array->size; i++) {
        GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME(array->data[i]);
    }
#endif
    free(array->data);
}
```

The good news is that with the template approach that tracking doesn't need to happen at runtime. The bad news is that it needs to happen in the `C` preprocessor.

### Implementor Experience

You might notice that there wouldn't be a warning if `GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME` was not defined. There also is a dearth of good names to give these things. It's understandable to get tripped up by the difference between `GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME` and `GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME`.

Best case scenario if you fill in one of these preprocessor defines wrongly is that your code doesn't compile. Worst case is that you get some insane and hard to debug behavior.

There will also end up being more than a few `#define`s you need to make.
I'm not making use of [concatenation](https://gcc.gnu.org/onlinedocs/cpp/Concatenation.html) for clarity, but even that doesn't trim the number down _that_ far.

If we write out some of the other functions you will see how this can be burdensome.

```c
#ifndef GROWABLE_ARRAY_STRUCT
#error "GROWABLE_ARRAY_STRUCT not defined"
#endif

#ifndef GROWABLE_ARRAY_STRUCT_POINTER
#error "GROWABLE_ARRAY_STRUCT_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA
#error "GROWABLE_ARRAY_DATA not defined"
#endif

#ifndef GROWABLE_ARRAY_DATA_POINTER
#error "GROWABLE_ARRAY_DATA_POINTER not defined"
#endif

#ifndef GROWABLE_ARRAY_EMPTY_FUNCTION_NAME
#error "GROWABLE_ARRAY_EMPTY_FUNCTION_NAME not defined"
#endif

#ifndef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME
#error "GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME not defined"
#endif

#ifndef GROWABLE_ARRAY_ADD_FUNCTION_NAME
#error "GROWABLE_ARRAY_ADD_FUNCTION_NAME not defined"
#endif

GROWABLE_ARRAY_STRUCT {
    GROWABLE_ARRAY_DATA_POINTER data;
    size_t allocated;
    size_t size;
};

GROWABLE_ARRAY_STRUCT GROWABLE_ARRAY_EMPTY_FUNCTION_NAME() {
    GROWABLE_ARRAY_STRUCT empty = {
        .data = NULL,
        .allocated = 0,
        .size = 0
    };
    return empty;
}

void GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array
) {
#ifdef GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME
    for (size_t i = 0; i < array->size; i++) {
        GROWABLE_ARRAY_ITEM_CLEANUP_FUNCTION_NAME(array->data[i]);
    }
#endif
    free(array->data);
}

void GROWABLE_ARRAY_ADD_FUNCTION_NAME(
    GROWABLE_ARRAY_STRUCT_POINTER array,
    GROWABLE_ARRAY_DATA value
) {
    if (array->size + 1 > array->allocated) {
        size_t new_allocated;
        if (array->size == 0) {
            new_allocated = 2;
        }
        else {
            new_allocated = array->size * 2;
        }

        // Each slot has to be big enough for one element,
        // not just for one pointer
        GROWABLE_ARRAY_DATA_POINTER new_data =
            (GROWABLE_ARRAY_DATA_POINTER)
                calloc(new_allocated, sizeof(GROWABLE_ARRAY_DATA));
        GROWABLE_ARRAY_DATA_POINTER old_data = array->data;

        // Copy all the old elements to it
        if (old_data != NULL) {
            for (size_t i = 0; i < array->size; i++) {
                new_data[i] = old_data[i];
            }
        }

        free(old_data);

        array->data =
            new_data;
        array->allocated = new_allocated;
    }

    array->data[array->size] = value;
    array->size++;
}
```

All of which still needs to be handled in each specialization.

```c
#ifndef GROWABLE_INT_ARRAY_H
#define GROWABLE_INT_ARRAY_H

#define GROWABLE_ARRAY_STRUCT struct GrowableIntArray
#define GROWABLE_ARRAY_STRUCT_POINTER struct GrowableIntArray*
#define GROWABLE_ARRAY_DATA int
#define GROWABLE_ARRAY_DATA_POINTER int*
#define GROWABLE_ARRAY_EMPTY_FUNCTION_NAME growable_int_array_empty
#define GROWABLE_ARRAY_ADD_FUNCTION_NAME growable_int_array_add
#define GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME growable_int_array_cleanup

#include "growable_array.h"

#undef GROWABLE_ARRAY_STRUCT
#undef GROWABLE_ARRAY_STRUCT_POINTER
#undef GROWABLE_ARRAY_DATA
#undef GROWABLE_ARRAY_DATA_POINTER
#undef GROWABLE_ARRAY_EMPTY_FUNCTION_NAME
#undef GROWABLE_ARRAY_ADD_FUNCTION_NAME
#undef GROWABLE_ARRAY_CLEANUP_FUNCTION_NAME

#endif
```

While this is all technically less work than making all the logic for a growable array from scratch ten times, it's certainly not pretty.

If you've ever had the life lesson of working with `C++` templates, this is the sort of thing that language feature is intended to replace.

```c++
template <typename T>
struct GrowableArray {
    T* data;
    size_t allocated;
    size_t size;
};
```

If you haven't, don't get too excited. There lie demons also.

## Conclusion

And that is basically it.

To grow an array you allocate a new array and copy data into it. To be efficient you allocate more memory than you need each time you grow.

If you want to make that data structure for more than one specific data type you either need to rely on runtime indirection and pointers or you need to dive into the C preprocessor and make template headers.

If you are a student who has a question you can ask below.

You can find complete examples of all these approaches in [this GitHub repo](https://github.com/bowbahdoe/c-growable-array).

Corrections welcome.
## Corrections

### `realloc`

Instead of `calloc` for everything it is more efficient to use `realloc`. When `malloc` and co. give you a chunk of memory that memory might secretly be larger than you requested. If it is, `realloc` can grow the block in place and skip the work of allocating a new block and copying everything over.

### Efficient Runtime Generic Growable Arrays

One thing that was pointed out to me is that using a `void**` for the runtime is a naive strategy.

We can avoid the memory indirection implied by having an array of pointers by storing the byte size of each element in the struct.

```c
struct GrowableArray {
    void* data;
    size_t allocated;
    size_t size;
    void (*cleanup)(void*);
    size_t element_size;
};
```

Then when we allocate data we get `void*` instead of a `void**` for our storage.

Functions like `growable_array_get` will still have to return a `void*` as a result, but those can be cast and dereferenced. What is important is that the data behind the `void*` will have the ideal memory layout.

### `calloc` behavior

A small point, but important in some contexts: `calloc` doesn't give you "the zero" for every type. It does fill the memory with `0` bytes, but I have been informed that for `float`s and similar all zero bytes might not be a zero value.

Wed, 21 Aug 2024 05:00:00 +0000

Just use Postgres
https://mccue.dev/pages/8-16-24-just-use-postgres

This is one part actionable advice, one part question for the audience.

Advice: When you are making a new application that requires persistent storage of data, as is the case for most web applications, your default choice should be `Postgres`.

### Why not `sqlite`?

`sqlite` is a pretty good database, but its data is stored in a single file.

This implies that whatever your application is, it is running on one machine and one machine only. Or at least one shared filesystem.

If you are making a desktop or mobile app, that's perfect. If you are making a website it might not be.
There are many success stories of using `sqlite` for a website, but they mostly involve people who set up their own servers and infrastructure. Platform-as-a-service offerings like Heroku, Railway, Render, etc. generally expect you to use a database accessed over a network boundary. It's not *wrong* to give up some of the benefits of those platforms, but do consider if the benefits of `sqlite` are worth giving up platform provided automatic database backups and the ability to provision more than one application server.

[The official documentation](https://www.sqlite.org/whentouse.html) has a good guide with some more specifics.

### Why not `DynamoDB`, `Cassandra`, or `MongoDB`?

Wherever Rick Houlihan is, I hope he is having a good day.

I watch a lot of conference talks, but his [2018 DynamoDB Deep Dive](https://www.youtube.com/watch?v=HaEPXoXVf2k) might be the one I've watched the most. I know very few of you are going to watch an hour-long talk, but you really should. It's a good one.

The thrust of it is that databases that are in the same genre as `DynamoDB` - which includes `Cassandra` and `MongoDB` - are fantastic **if** - and this is a load bearing if:

* You know exactly what your app needs to do, up-front
* You know exactly what your access patterns will be, up-front
* You have a known need to scale to really large sizes of data
* You are okay giving up some level of consistency

This is because this sort of database is basically a giant distributed hash map. The only operations that work without needing to scan the entire database are lookups by partition key and scans that make use of a sort key.

Whatever queries you need to make, you need to encode that knowledge in one of those indexes before you store it. You want to store users and look them up by either first name or last name? Well you best have a sort key that looks like `<FIRST NAME>$<LAST NAME>`. Your access patterns should be baked into how you store your data.
If your access patterns change significantly, you might need to reprocess all of your data.

It's annoying because, especially with `MongoDB`, people come into it having been sold on it being a more "flexible" database. Yes, you don't need to give it a schema. Yes, you can just dump untyped JSON into collections. No, this is not a flexible kind of database. It is an efficient one.

With a relational database you can go from getting all the pets of a person to getting all the owners of a pet by slapping an index or two on your tables. With this genre of NoSQL, that can be a tall order.

It's also not amazing if you need to run analytics queries. Arbitrary questions like "How many users signed up in the last month" can be trivially answered by writing a SQL query, perhaps on a read-replica if you are worried about running an expensive query on the same machine that is dealing with customer traffic. It's just outside the scope of this kind of database. You need to be ETL-ing your data out to handle it.

If you see a college student or fresh grad using `MongoDB` stop them. They need help. They have been led astray.

### Why not `Valkey`?

The artist formerly known as `Redis` is best known for being an efficient out-of-process cache. You compute something expensive once and slap it in `Valkey` so all 5 or so HTTP servers you have don't need to recompute it.

However, you _can_ use it as your primary database. It stores all its data in RAM, so it's pretty fast if you do that.

Obvious problems:

* You can only have so much RAM. You can have a lot more than you'd think, but it's still pretty limited compared to hard drives.
* Same as the `DynamoDB`-likes, you need to make concessions on how you model your data.

### Why not `Datomic`?

If you already knew about this one, you get a gold star.

`Datomic` is a `NoSQL` database, but it is a relational one. The "up-front design" problems aren't there, and it does have some neat properties.

You don't store data in tables.
It's all "entity-attribute-value-time" (EAVT) pairs. Instead of a person row with `id`, `name`, and `age` you store `1 :person/name "Beth"` and `1 :person/age 30`. Then your queries work off of "universal" indexes.

You don't need to coordinate with writers when making queries. You query the database "as-of" a given time. New data, even deletions (or as they call them "retractions"), don't actually delete old data.

But there are some significant problems:

* It only works with JVM languages.
* Outside of `Clojure`, a relatively niche language, its API sucks.
* If you structure a query badly the error messages you get are terrible.
* The whole universe of tools that exist for SQL just aren't there.

### Why not `XTDB`?

`Clojure` people make a lot of databases.

`XTDB` is spiritually similar to `Datomic` but:

* There is an HTTP api, so you aren't locked to the JVM.
* It has two axes of time you can query against. "System Time" - when records were inserted - and "Valid Time."
* It has a SQL API.

The biggest points against it are:

* It's new. Its SQL API is something that popped up in the last year. It recently changed its whole storage model. Will the company behind it survive the next 10 years? Who knows!

Okay that's just one point. I'm sure I could think of more, but treat this as a stand-in for any recently developed database. The best predictor something will continue to exist into the future is how long it has existed. COBOL has been around for decades, it will likely continue to exist for decades.

If you have persistent storage, you want as long a support term as you can get. You can certainly choose to pick a newer or experimental database for your app but, regardless of technical properties, that's a risky choice. It shouldn't be your default.

### Why not `Kafka`?

`Kafka` is an append only log. It can handle TBs of data. It is a very good append only log.
It works amazingly well if you want to do event sourcing type stuff with data flowing in from multiple services maintained by multiple teams of humans.

But:

* Up to a certain scale, a table in Postgres works perfectly fine as an append only log.
* You likely do not have hundreds of people working on your product nor TBs of events flowing in.
* Making a Kafka consumer is a bit more error-prone than you'd expect. You need to keep track of your place in the log after all.
* Even when maintained by a cloud provider (and there are good managed `Kafka` services) it's another piece of infrastructure you need to monitor.

### Why not `ElasticSearch`?

Is searching over data the primary function of your product? If yes, `ElasticSearch` is going to give you some real pros. You will need to ETL your data into it and manage that whole process, but `ElasticSearch` is built for searching. It does searching good.

If no, `Postgres` will be fine. A sprinkling of `ilike` and the built-in [full text search](https://www.postgresql.org/docs/current/textsearch.html) is more than enough for most applications.

You can always bolt on a dedicated search thing later.

### Why not `MSSQL` or `Oracle DB`?

Genuine question you should ask yourself: Are these worth the price tag?

I don't just mean the straight-up cost to license, but also the cost of lock-in. Once your data is in `Oracle DB` you are going to be paying Oracle forever. You are going to have to train your coders on its idiosyncrasies, forever. You are going to have to decide between enterprise features and your wallet, forever.

I know it's super unlikely that you will contribute a patch to `Postgres`, so I won't pretend that there is some magic "power of open source" going on, but I think you should have a very specific need in mind to choose a proprietary DB. If you don't have some killer `MSSQL` feature that you simply cannot live without, don't use it.

### Why not `MySQL`?

This is the one that I need some audience help with.
`MySQL` is owned by Oracle. There are [features locked behind their enterprise editions](https://www.mysql.com/products/enterprise/compare/). To an extent you will have lock-in issues the same as any other DB.

But the free edition `MySQL` has also been used in an extremely wide range of things. It's been around for a long time. There are people who know how to work with it.

My problem is that I've only spent ~6 months of my professional career working with it. I genuinely don't know enough to compare it intelligently to `Postgres`.

I'm convinced it isn't secretly so much better that I am doing folks a disservice when telling them to use `Postgres`, and I do remember reading about how `Postgres` generally has better support for enforcing invariants in the DB itself, but I wouldn't mind being schooled a bit here.

### Why not some AI vector DB?

* Most are new. Remember the risks of using something new.
* AI is a bubble. A load-bearing bubble, but a bubble. Don't build a house on it if you can avoid it.
* Even if your business is another AI grift, you probably only need to `import openai`.

### Why not Google Sheets?

You're right. I can't think of any downsides. Go for it.

Fri, 16 Aug 2024 05:00:00 +0000

I Can't Run My Rust Game Either
https://mccue.dev/pages/8-1-24-i-cant-run-my-rust-game

Yesterday I [talked about an issue I had updating a Rust project](https://mccue.dev/pages/7-31-24-rust-just-failed-its-test).

The time between when that project was working for me and when it was not was only a few months.

But I have one other Rust project I haven't touched in a while. [This little game](https://github.com/bowbahdoe/alien_game). In it, you play as a Pong paddle catching or dodging bullets from an alien.

It was a fun project at the time and a good exercise with Rust.

It also does not compile today.
```bash Compiling winit v0.19.5 error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/view.rs:209:9 | 205 | extern fn has_marked_text(this: &Object, _sel: Sel) -> BOOL { | ---- expected `bool` because of return type ... 209 | (marked_text.length() > 0) as i8 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `bool`, found `i8` error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:103:26 | 103 | is_zoomed != 0 | --------- ^ expected `bool`, found integer | | | expected because this is `bool` error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:175:57 | 175 | self.window.setFrame_display_(new_rect, 0); | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1301:48 | 1301 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1308:48 | 1308 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to 
this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1325:48 | 1325 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1332:48 | 1332 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ For more information about this error, try `rustc --explain E0308`. error: could not compile `winit` (lib) due to 7 previous errors ``` A lot of people suggested locking to an older version of the Rust toolchain. So I tried to set my laptop to have whatever the Rust compiler was the last time I made a commit on that project. ```bash ➜ alien_game git:(master) ✗ rustup toolchain install stable-2020-03-25 info: syncing channel updates for 'stable-2020-03-25-aarch64-apple-darwin' error: no release found for 'stable-2020-03-25' ``` Oh, yeah. ARM Macs weren't a thing in 2020. I no longer have the laptop I used to write this code. 
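For what it's worth, the `expected bool` errors above suggest that `BOOL` resolves to Rust's `bool` on this ARM Mac, while `winit` 0.19.5 was written as if `BOOL` were always an integer. Here is a minimal sketch of how that kind of breakage can happen, assuming the alias is picked per target architecture. The names `ObjcBool`, `to_objc_bool` and `from_objc_bool` are made up for illustration and are not any crate's real API, and `has_marked_text` is a simplified stand-in for the function in the log.

```rust
// Hypothetical stand-in for an Objective-C BOOL alias whose Rust type
// depends on the target architecture (an assumption for this sketch).
#[cfg(target_arch = "aarch64")]
pub type ObjcBool = bool;
#[cfg(not(target_arch = "aarch64"))]
pub type ObjcBool = i8;

// Convert a Rust bool into whatever the alias is on this target.
#[cfg(target_arch = "aarch64")]
pub fn to_objc_bool(value: bool) -> ObjcBool {
    value
}
#[cfg(not(target_arch = "aarch64"))]
pub fn to_objc_bool(value: bool) -> ObjcBool {
    value as i8
}

// Convert the alias back into a Rust bool on this target.
#[cfg(target_arch = "aarch64")]
pub fn from_objc_bool(value: ObjcBool) -> bool {
    value
}
#[cfg(not(target_arch = "aarch64"))]
pub fn from_objc_bool(value: ObjcBool) -> bool {
    value != 0
}

// Simplified version of the function from the error log.
// Writing `(length > 0) as i8` here only type-checks where the alias is `i8`;
// going through the helper compiles on both kinds of target.
pub fn has_marked_text(length: usize) -> ObjcBool {
    to_objc_bool(length > 0)
}

fn main() {
    assert!(from_objc_bool(has_marked_text(3)));
    assert!(!from_objc_bool(has_marked_text(0)));
    println!("ok");
}
```

Code that hard-codes `as i8` or `!= 0` compiles fine on the machine it was written on and then fails on a target where the alias is `bool`, which is consistent with what the seven errors look like.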
I can't find a list of available toolchains online and trying every date is tiresome - maybe I'll script it if all else fails - but there are only two dependencies. ```toml [dependencies] ggez = "0.5" rand = "0.7.3" ``` What if I just lock newer versions of `winit` and `cocoa`? ```toml [dependencies] ggez = "0.5" rand = "0.7.3" winit = "0.30.4" cocoa = "0.25.0" ``` ```bash Compiling winit v0.19.5 error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/view.rs:209:9 | 205 | extern fn has_marked_text(this: &Object, _sel: Sel) -> BOOL { | ---- expected `bool` because of return type ... 209 | (marked_text.length() > 0) as i8 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `bool`, found `i8` error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:103:26 | 103 | is_zoomed != 0 | --------- ^ expected `bool`, found integer | | | expected because this is `bool` error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:175:57 | 175 | self.window.setFrame_display_(new_rect, 0); | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1301:48 | 1301 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 
| unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1308:48 | 1308 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1325:48 | 1325 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ error[E0308]: mismatched types --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.19.5/src/platform/macos/window.rs:1332:48 | 1332 | window.setFrame_display_(current_rect, 0) | ----------------- ^ expected `bool`, found integer | | | arguments to this method are incorrect | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cocoa-0.18.5/src/appkit.rs:932:15 | 932 | unsafe fn setFrame_display_(self, windowFrame: NSRect, display: BOOL); | ^^^^^^^^^^^^^^^^^ For more information about this error, try `rustc --explain E0308`. error: could not compile `winit` (lib) due to 7 previous errors ``` I guess hell or high water it is bringing in that version of `winit`. Well, what if I upgraded `ggez` to the latest? 
```toml [dependencies] ggez = "0.9.3" rand = "0.7.3" ``` ```bash Compiling rustisbetter v0.1.0 (/Users/emccue/Development/alien_game) error[E0432]: unresolved import `ggez::event::quit` --> src/main.rs:2:25 | 2 | use ggez::event::{self, quit, EventHandler, KeyCode, KeyMods}; | ^^^^ no `quit` in `event` error[E0432]: unresolved import `ggez::graphics::Font` --> src/main.rs:4:5 | 4 | use ggez::graphics::Font; | ^^^^^^^^^^^^^^^^^^^^ no `Font` in `graphics` error[E0432]: unresolved import `ggez::nalgebra` --> src/main.rs:6:11 | 6 | use ggez::nalgebra::Point2; | ^^^^^^^^ could not find `nalgebra` in `ggez` error[E0432]: unresolved import `ggez::nalgebra` --> src/alien.rs:3:11 | 3 | use ggez::nalgebra::Point2; | ^^^^^^^^ could not find `nalgebra` in `ggez` error[E0432]: unresolved import `ggez::nalgebra` --> src/bullet.rs:3:11 | 3 | use ggez::nalgebra::Point2; | ^^^^^^^^ could not find `nalgebra` in `ggez` error[E0425]: cannot find function `screen_coordinates` in module `graphics` --> src/main.rs:177:44 | 177 | let screen_coordinates = graphics::screen_coordinates(&ctx); | ^^^^^^^^^^^^^^^^^^ not found in `graphics` error[E0425]: cannot find function `clear` in module `graphics` --> src/main.rs:318:19 | 318 | graphics::clear(ctx, graphics::WHITE); | ^^^^^ not found in `graphics` error[E0425]: cannot find value `WHITE` in module `graphics` --> src/main.rs:318:40 | 318 | graphics::clear(ctx, graphics::WHITE); | ^^^^^ not found in `graphics` error[E0425]: cannot find function `present` in module `graphics` --> src/main.rs:324:19 | 324 | graphics::present(ctx)?; | ^^^^^^^ not found in `graphics` error[E0603]: enum `KeyCode` is private --> src/main.rs:2:45 | 2 | use ggez::event::{self, quit, EventHandler, KeyCode, KeyMods}; | ^^^^^^^ private enum | note: the enum `KeyCode` is defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/event.rs:34:30 | 34 | use crate::input::keyboard::{KeyCode, KeyInput, KeyMods}; | ^^^^^^^ help: import 
`KeyCode` directly | 2 | use ggez::event::{self, quit, EventHandler, winit::event::VirtualKeyCode, KeyMods}; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0603]: struct `KeyMods` is private --> src/main.rs:2:54 | 2 | use ggez::event::{self, quit, EventHandler, KeyCode, KeyMods}; | ^^^^^^^ private struct | note: the struct `KeyMods` is defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/event.rs:34:49 | 34 | use crate::input::keyboard::{KeyCode, KeyInput, KeyMods}; | ^^^^^^^ help: import `KeyMods` directly | 2 | use ggez::event::{self, quit, EventHandler, KeyCode, ggez::input::keyboard::KeyMods}; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ warning: unused imports: `BlendMode`, `Rect` --> src/main.rs:5:22 | 5 | use ggez::graphics::{BlendMode, DrawParam, Drawable, Rect, Text}; | ^^^^^^^^^ ^^^^ | = note: `#[warn(unused_imports)]` on by default warning: unused import: `GameError` --> src/main.rs:7:19 | 7 | use ggez::{audio, GameError}; | ^^^^^^^^^ warning: unused import: `rand::seq::SliceRandom` --> src/main.rs:9:5 | 9 | use rand::seq::SliceRandom; | ^^^^^^^^^^^^^^^^^^^^^^ warning: unused import: `std::error::Error` --> src/main.rs:12:5 | 12 | use std::error::Error; | ^^^^^^^^^^^^^^^^^ warning: unused import: `std::iter::Peekable` --> src/main.rs:16:5 | 16 | use std::iter::Peekable; | ^^^^^^^^^^^^^^^^^^^ warning: unused import: `std::iter::Peekable` --> src/alien.rs:9:5 | 9 | use std::iter::Peekable; | ^^^^^^^^^^^^^^^^^^^ warning: unused import: `std::ops::Add` --> src/alien.rs:10:5 | 10 | use std::ops::Add; | ^^^^^^^^^^^^^ warning: unnecessary parentheses around pattern --> src/alien.rs:80:13 | 80 | let ((min_x, max_x)) = self.x_movement_range; | ^ ^ | = note: `#[warn(unused_parens)]` on by default help: remove these parentheses | 80 - let ((min_x, max_x)) = self.x_movement_range; 80 + let (min_x, max_x) = self.x_movement_range; | warning: use of deprecated function `ggez::filesystem::open`: Use `ctx.fs.open` instead --> src/main.rs:142:65 | 
142 | let data = audio::SoundData::from_read(&mut filesystem::open(ctx, "/Bloop.mp3")?)?; | ^^^^ | = note: `#[warn(deprecated)]` on by default error[E0050]: method `key_down_event` has 5 parameters but the declaration in trait `key_down_event` has 4 --> src/main.rs:329:9 | 329 | / &mut self, 330 | | ctx: &mut Context, 331 | | keycode: KeyCode, 332 | | _keymods: KeyMods, 333 | | _repeat: bool, | |_____________________^ expected 4 parameters, found 5 | = note: `key_down_event` from trait: `fn(&mut Self, &mut ggez::Context, KeyInput, bool) -> Result<(), E>` error[E0050]: method `key_up_event` has 4 parameters but the declaration in trait `key_up_event` has 3 --> src/main.rs:348:21 | 348 | fn key_up_event(&mut self, _ctx: &mut Context, keycode: KeyCode, _keymods: KeyMods) { | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected 3 parameters, found 4 | = note: `key_up_event` from trait: `fn(&mut Self, &mut ggez::Context, KeyInput) -> Result<(), E>` error[E0053]: method `resize_event` has an incompatible type for trait --> src/main.rs:358:76 | 358 | fn resize_event(&mut self, _ctx: &mut Context, width: f32, height: f32) { | ^ expected `Result<(), GameError>`, found `()` | = note: expected signature `fn(&mut Game, &mut ggez::Context, _, _) -> Result<(), GameError>` found signature `fn(&mut Game, &mut ggez::Context, _, _)` help: change the output type to match the trait | 358 | fn resize_event(&mut self, _ctx: &mut Context, width: f32, height: f32) -> Result<(), GameError> { | ++++++++++++++++++++++++ error[E0308]: mismatched types --> src/alien.rs:141:13 | 140 | sprite.draw( | ---- arguments to this method are incorrect 141 | ctx, | ^^^ expected `&mut Canvas`, found `&mut Context` | = note: expected mutable reference `&mut Canvas` found mutable reference `&mut ggez::Context` note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:293:8 | 293 | fn draw(&self, canvas: &mut 
Canvas, param: impl Into<DrawParam>); | ^^^^ error[E0308]: mismatched types --> src/alien.rs:140:9 | 134 | pub fn draw(&self, ctx: &mut Context) -> GameResult<()> { | -------------- expected `Result<(), GameError>` because of return type ... 140 | / sprite.draw( 141 | | ctx, 142 | | DrawParam::default() 143 | | .offset(Point2::new(0.5, 0.5)) 144 | | .dest(Point2::new(self.pos.0, self.pos.1)), 145 | | ) | |_________^ expected `Result<(), GameError>`, found `()` | = note: expected enum `Result<(), GameError>` found unit type `()` help: try adding an expression at the end of the block | 145 ~ ); 146 + Ok(()) | error[E0308]: mismatched types --> src/bullet.rs:54:13 | 53 | self.sprite.draw( | ---- arguments to this method are incorrect 54 | ctx, | ^^^ expected `&mut Canvas`, found `&mut Context` | = note: expected mutable reference `&mut Canvas` found mutable reference `&mut ggez::Context` note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:293:8 | 293 | fn draw(&self, canvas: &mut Canvas, param: impl Into<DrawParam>); | ^^^^ error[E0308]: mismatched types --> src/bullet.rs:53:9 | 52 | pub fn draw(&self, ctx: &mut Context) -> GameResult<()> { | -------------- expected `Result<(), GameError>` because of return type 53 | / self.sprite.draw( 54 | | ctx, 55 | | DrawParam::default() 56 | | .offset(Point2::new(0.5, 0.5)) 57 | | .dest(Point2::new(self.pos.0, self.pos.1)) 58 | | .rotation(FRAC_PI_2), 59 | | ) | |_________^ expected `Result<(), GameError>`, found `()` | = note: expected enum `Result<(), GameError>` found unit type `()` help: try adding an expression at the end of the block | 59 ~ ); 60 + Ok(()) | error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/bullet.rs:100:34 | 100 | self.pos.0 - self.sprite.dimensions().w as f32 / 2.0 | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> 
/Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 100 | self.pos.0 - self.sprite.dimensions(/* gfx */).w as f32 / 2.0 | ~~~~~~~~~~~ error[E0609]: no field `w` on type `Option<Rect>` --> src/bullet.rs:100:47 | 100 | self.pos.0 - self.sprite.dimensions().w as f32 / 2.0 | ^ unknown field | help: one of the expressions' fields has a field of the same name | 100 | self.pos.0 - self.sprite.dimensions().unwrap().w as f32 / 2.0 | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/bullet.rs:104:34 | 104 | self.pos.1 - self.sprite.dimensions().h as f32 / 2.0 | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 104 | self.pos.1 - self.sprite.dimensions(/* gfx */).h as f32 / 2.0 | ~~~~~~~~~~~ error[E0609]: no field `h` on type `Option<Rect>` --> src/bullet.rs:104:47 | 104 | self.pos.1 - self.sprite.dimensions().h as f32 / 2.0 | ^ unknown field | help: one of the expressions' fields has a field of the same name | 104 | self.pos.1 - self.sprite.dimensions().unwrap().h as f32 / 2.0 | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/bullet.rs:108:21 | 108 | self.sprite.dimensions().w | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 108 | self.sprite.dimensions(/* gfx */).w | ~~~~~~~~~~~ error[E0609]: no field `w` on type 
`Option<Rect>` --> src/bullet.rs:108:34 | 108 | self.sprite.dimensions().w | ^ unknown field | help: one of the expressions' fields has a field of the same name | 108 | self.sprite.dimensions().unwrap().w | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/bullet.rs:112:21 | 112 | self.sprite.dimensions().h | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 112 | self.sprite.dimensions(/* gfx */).h | ~~~~~~~~~~~ error[E0609]: no field `h` on type `Option<Rect>` --> src/bullet.rs:112:34 | 112 | self.sprite.dimensions().h | ^ unknown field | help: one of the expressions' fields has a field of the same name | 112 | self.sprite.dimensions().unwrap().h | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/main.rs:53:34 | 53 | self.pos.0 - self.sprite.dimensions().w as f32 / 2.0 | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 53 | self.pos.0 - self.sprite.dimensions(/* gfx */).w as f32 / 2.0 | ~~~~~~~~~~~ error[E0609]: no field `w` on type `Option<Rect>` --> src/main.rs:53:47 | 53 | self.pos.0 - self.sprite.dimensions().w as f32 / 2.0 | ^ unknown field | help: one of the expressions' fields has a field of the same name | 53 | self.pos.0 - self.sprite.dimensions().unwrap().w as f32 / 2.0 | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/main.rs:57:34 | 57 | self.pos.1 - self.sprite.dimensions().h as f32 / 2.0 | ^^^^^^^^^^-- an argument of 
type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 57 | self.pos.1 - self.sprite.dimensions(/* gfx */).h as f32 / 2.0 | ~~~~~~~~~~~ error[E0609]: no field `h` on type `Option<Rect>` --> src/main.rs:57:47 | 57 | self.pos.1 - self.sprite.dimensions().h as f32 / 2.0 | ^ unknown field | help: one of the expressions' fields has a field of the same name | 57 | self.pos.1 - self.sprite.dimensions().unwrap().h as f32 / 2.0 | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/main.rs:61:21 | 61 | self.sprite.dimensions().w | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 61 | self.sprite.dimensions(/* gfx */).w | ~~~~~~~~~~~ error[E0609]: no field `w` on type `Option<Rect>` --> src/main.rs:61:34 | 61 | self.sprite.dimensions().w | ^ unknown field | help: one of the expressions' fields has a field of the same name | 61 | self.sprite.dimensions().unwrap().w | +++++++++ error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/main.rs:65:21 | 65 | self.sprite.dimensions().h | ^^^^^^^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:299:8 | 299 | fn dimensions(&self, gfx: &impl Has<GraphicsContext>) -> Option<Rect>; | ^^^^^^^^^^ help: provide the argument | 65 | self.sprite.dimensions(/* gfx */).h | ~~~~~~~~~~~ error[E0609]: no field `h` on type `Option<Rect>` --> src/main.rs:65:34 | 65 | self.sprite.dimensions().h | 
^ unknown field | help: one of the expressions' fields has a field of the same name | 65 | self.sprite.dimensions().unwrap().h | +++++++++ error[E0624]: associated function `new` is private --> src/main.rs:120:50 | 120 | alien_idle: Rc::new(graphics::Image::new(ctx, "/ENEMY.png")?), | ^^^ private associated function | ::: /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... | 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:120:33 | 120 | alien_idle: Rc::new(graphics::Image::new(ctx, "/ENEMY.png")?), | ^^^^^^^^^^^^^^^^^^^^------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 120 | alien_idle: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:120:33 | 120 | alien_idle: Rc::new(graphics::Image::new(ctx, "/ENEMY.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0624]: associated function `new` is private --> src/main.rs:121:52 | 121 | alien_firing: 
Rc::new(graphics::Image::new(ctx, "/ENEMY_FIRING.png")?), | ^^^ private associated function | ::: /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... | 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:121:35 | 121 | alien_firing: Rc::new(graphics::Image::new(ctx, "/ENEMY_FIRING.png")?), | ^^^^^^^^^^^^^^^^^^^^-------------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 121 | alien_firing: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:121:35 | 121 | alien_firing: Rc::new(graphics::Image::new(ctx, "/ENEMY_FIRING.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0624]: associated function `new` is private --> src/main.rs:122:46 | 122 | player: Rc::new(graphics::Image::new(ctx, "/PLAYER_OLD_2.png")?), | ^^^ private associated function | ::: 
/Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... | 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:122:29 | 122 | player: Rc::new(graphics::Image::new(ctx, "/PLAYER_OLD_2.png")?), | ^^^^^^^^^^^^^^^^^^^^-------------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 122 | player: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:122:29 | 122 | player: Rc::new(graphics::Image::new(ctx, "/PLAYER_OLD_2.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0624]: associated function `new` is private --> src/main.rs:123:50 | 123 | red_bullet: Rc::new(graphics::Image::new(ctx, "/Red_Missile.png")?), | ^^^ private associated function | ::: /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... 
| 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:123:33 | 123 | red_bullet: Rc::new(graphics::Image::new(ctx, "/Red_Missile.png")?), | ^^^^^^^^^^^^^^^^^^^^------------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 123 | red_bullet: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:123:33 | 123 | red_bullet: Rc::new(graphics::Image::new(ctx, "/Red_Missile.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0624]: associated function `new` is private --> src/main.rs:124:52 | 124 | green_bullet: Rc::new(graphics::Image::new(ctx, "/MISSILE_FIRED.png")?), | ^^^ private associated function | ::: /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... 
| 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:124:35 | 124 | green_bullet: Rc::new(graphics::Image::new(ctx, "/MISSILE_FIRED.png")?), | ^^^^^^^^^^^^^^^^^^^^--------------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 124 | green_bullet: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:124:35 | 124 | green_bullet: Rc::new(graphics::Image::new(ctx, "/MISSILE_FIRED.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0624]: associated function `new` is private --> src/main.rs:125:50 | 125 | background: Rc::new(graphics::Image::new(ctx, "/Space.png")?), | ^^^ private associated function | ::: /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:5 | 161 | / fn new( 162 | | wgpu: &WgpuContext, 163 | | format: ImageFormat, 164 | | width: u32, ... 
| 167 | | usage: wgpu::TextureUsages, 168 | | ) -> Self { | |_____________- private associated function defined here error[E0061]: this function takes 6 arguments but 2 arguments were supplied --> src/main.rs:125:33 | 125 | background: Rc::new(graphics::Image::new(ctx, "/Space.png")?), | ^^^^^^^^^^^^^^^^^^^^------------------- | || | | || expected `TextureFormat`, found `&str` | |expected `&WgpuContext`, found `&mut Context` | multiple arguments are missing | = note: expected reference `&WgpuContext` found mutable reference `&mut ggez::Context` note: associated function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/image.rs:161:8 | 161 | fn new( | ^^^ help: provide the arguments | 125 | background: Rc::new(graphics::Image::new(/* &WgpuContext */, /* wgpu_types::TextureFormat */, /* u32 */, /* u32 */, /* u32 */, /* wgpu_types::TextureUsages */)?), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:125:33 | 125 | background: Rc::new(graphics::Image::new(ctx, "/Space.png")?), | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `Image` | = help: the trait `Try` is not implemented for `Image` error[E0061]: this method takes 1 argument but 0 arguments were supplied --> src/main.rs:257:30 | 257 | game.audio.bloop.play()?; | ^^^^-- an argument of type `&_` is missing | note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/audio.rs:131:8 | 131 | fn play(&mut self, audio: &impl Has<AudioContext>) -> GameResult { | ^^^^ help: provide the argument | 257 | game.audio.bloop.play(/* audio */)?; | ~~~~~~~~~~~~~ error[E0308]: mismatched types --> src/main.rs:274:34 | 274 | game.sprites.background.draw(ctx, DrawParam::default()) | ---- ^^^ expected `&mut Canvas`, 
found `&mut Context` | | | arguments to this method are incorrect | = note: expected mutable reference `&mut Canvas` found mutable reference `&mut ggez::Context` note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:293:8 | 293 | fn draw(&self, canvas: &mut Canvas, param: impl Into<DrawParam>); | ^^^^ error[E0308]: mismatched types --> src/main.rs:274:5 | 273 | fn draw_background(ctx: &mut Context, game: &Game) -> GameResult<()> { | -------------- expected `Result<(), GameError>` because of return type 274 | game.sprites.background.draw(ctx, DrawParam::default()) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `Result<(), GameError>`, found `()` | = note: expected enum `Result<(), GameError>` found unit type `()` help: try adding an expression at the end of the block | 274 ~ game.sprites.background.draw(ctx, DrawParam::default()); 275 + Ok(()) | error[E0308]: mismatched types --> src/main.rs:290:9 | 289 | game.sprites.player.draw( | ---- arguments to this method are incorrect 290 | ctx, | ^^^ expected `&mut Canvas`, found `&mut Context` | = note: expected mutable reference `&mut Canvas` found mutable reference `&mut ggez::Context` note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:293:8 | 293 | fn draw(&self, canvas: &mut Canvas, param: impl Into<DrawParam>); | ^^^^ error[E0308]: mismatched types --> src/main.rs:289:5 | 288 | fn draw_player(ctx: &mut Context, game: &Game) -> GameResult<()> { | -------------- expected `Result<(), GameError>` because of return type 289 | / game.sprites.player.draw( 290 | | ctx, 291 | | DrawParam::default() 292 | | .offset(Point2::new(0.5, 0.5)) 293 | | .dest(Point2::new(game.player.pos.0, game.player.pos.1)) 294 | | .rotation(FRAC_PI_2), 295 | | ) | |_____^ expected `Result<(), GameError>`, found `()` | = note: expected enum `Result<(), GameError>` found unit type 
`()` help: try adding an expression at the end of the block | 295 ~ ); 296 + Ok(()) | error[E0308]: mismatched types --> src/main.rs:301:9 | 300 | text.draw( | ---- arguments to this method are incorrect 301 | ctx, | ^^^ expected `&mut Canvas`, found `&mut Context` | = note: expected mutable reference `&mut Canvas` found mutable reference `&mut ggez::Context` note: method defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/graphics/draw.rs:293:8 | 293 | fn draw(&self, canvas: &mut Canvas, param: impl Into<DrawParam>); | ^^^^ error[E0277]: the `?` operator can only be applied to values that implement `Try` --> src/main.rs:300:5 | 300 | / text.draw( 301 | | ctx, 302 | | DrawParam::default().dest(Point2::new( 303 | | game.screen_size.0 as f32 / 2.0, 304 | | game.screen_size.1 as f32 / 2.0, 305 | | )), 306 | | )?; | |______^ the `?` operator cannot be applied to type `()` | = help: the trait `Try` is not implemented for `()` error[E0277]: the trait bound `&mut Game: EventHandler<_>` is not satisfied --> src/main.rs:375:43 | 375 | event::run(&mut ctx, &mut event_loop, &mut my_game)?; | ---------- ^^^^^^^^^^^^ the trait `EventHandler<_>` is not implemented for `&mut Game` | | | required by a bound introduced by this call | note: required by a bound in `run` --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/event.rs:283:8 | 281 | pub fn run<S: 'static, E>(mut ctx: Context, event_loop: EventLoop<()>, mut state: S) -> ! 
| --- required by a bound in this function 282 | where 283 | S: EventHandler<E>, | ^^^^^^^^^^^^^^^ required by this bound in `run` help: consider removing the leading `&`-reference | 375 - event::run(&mut ctx, &mut event_loop, &mut my_game)?; 375 + event::run(&mut ctx, &mut event_loop, my_game)?; | error[E0308]: arguments to this function are incorrect --> src/main.rs:375:5 | 375 | event::run(&mut ctx, &mut event_loop, &mut my_game)?; | ^^^^^^^^^^ -------- expected `Context`, found `&mut Context` | note: expected `EventLoop<()>`, found `&mut EventLoop<()>` --> src/main.rs:375:26 | 375 | event::run(&mut ctx, &mut event_loop, &mut my_game)?; | ^^^^^^^^^^^^^^^ = note: expected struct `EventLoop<_>` found mutable reference `&mut EventLoop<_>` note: function defined here --> /Users/emccue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ggez-0.9.3/src/event.rs:281:8 | 281 | pub fn run<S: 'static, E>(mut ctx: Context, event_loop: EventLoop<()>, mut state: S) -> ! | ^^^ help: consider removing the borrow | 375 - event::run(&mut ctx, &mut event_loop, &mut my_game)?; 375 + event::run(ctx, &mut event_loop, &mut my_game)?; | help: consider removing the borrow | 375 - event::run(&mut ctx, &mut event_loop, &mut my_game)?; 375 + event::run(&mut ctx, event_loop, &mut my_game)?; | warning: unreachable call --> src/main.rs:375:5 | 375 | event::run(&mut ctx, &mut event_loop, &mut my_game)?; | ---------------------------------------------------^ | | | unreachable call | any code following this expression is unreachable | = note: `#[warn(unreachable_code)]` on by default warning: unused import: `BulletFactory` --> src/main.rs:26:29 | 26 | use crate::bullet::{Bullet, BulletFactory, BulletFactoryImpl}; | ^^^^^^^^^^^^^ Some errors have detailed explanations: E0050, E0053, E0061, E0277, E0308, E0425, E0432, E0603, E0609... For more information about an error, try `rustc --explain E0050`. 
warning: `rustisbetter` (bin "rustisbetter") generated 11 warnings error: could not compile `rustisbetter` (bin "rustisbetter") due to 61 previous errors; 11 warnings emitted ``` Guess they really took [ZeroVer](https://0ver.org/) to heart, huh? So as it stands I have no clue how to run this Rust project on my Laptop. 1. If you have a clue, let me know. 2. Why did this Rust project bit-rot? Actually curious. 3. Is this representative of what will happen to any Rust project I make?Thu, 01 Aug 0024 05:00:00 +0000Rust Just Failed an Important Testhttps://mccue.dev/pages/7-31-24-rust-just-failed-its-test I have two Rust projects I maintain. The first is a parser for the [EDN Data Format](https://github.com/bowbahdoe/edn-format). I haven't had to touch that one in a while. Best I can tell it's all still working. The second is a fork of the Rust Playground for [running Java code](https://run.mccue.dev/). I also haven't had to touch that one in a while, but I did today to update the versions of Java available and include updated early access builds. When I did that, despite having not changed any dependencies, I got a build error in CI/CD. Build log is [here](https://github.com/bowbahdoe/run-java-code/actions/runs/10181544029/job/28161965425) if anyone wants to see. ``` Compiling io-lifetimes v1.0.11 Compiling doc-comment v0.3.3 Compiling smallvec v1.10.0 Compiling pin-project v1.1.0 Compiling miniz_oxide v0.6.2 Compiling time v0.3.22 error[E0282]: type annotations needed for `Box<_>` --> /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/time-0.3.22/src/format_description/parse/mod.rs:83:9 | 83 | let items = format_items | ^^^^^ ... 86 | Ok(items.into()) | ---- type must be known at this point | help: consider giving `items` an explicit type, where the placeholders `_` are specified | 83 | let items: Box<_> = format_items | ++++++++ For more information about this error, try `rustc --explain E0282`. 
``` And just like that, I've lost trust in Rust's resiliency to bit-rot. It's not that deep an error and I resolved it by pinning a higher version of the time library, but it still sucks. To me, whether code will "just work" into the future is an important property of a language and ecosystem. Maybe I had inflated expectations because of Rust editions, and maybe I'm being too harsh, but I now have this feeling of unease that I didn't before.Wed, 31 Jul 0024 05:00:00 +0000You can run Java like Python nowhttps://mccue.dev/pages/7-27-24-you-can-run-java-like-python-or-ruby This is meant to be a brief PSA for the programming general public. All this is known to the people following Java closely, but I figure most are not. As of Java 22, you can run Java code like you would an interpreted language such as Python, Ruby, JavaScript, etc. This means that ahead-of-time compilation is no longer strictly required. Say you have the following files. `src/Main.java` ```java class Main { public static void main(String[] args) { System.out.println(Example.text()); } } ``` `src/Example.java` ```java class Example { static String text() { return "example"; } } ``` You can directly run this project with `java src/Main.java`. This is very new. The Java ecosystem doesn't yet have an accepted equivalent of `pip` or `npm` that isn't also tied to a build tool. Now that a build tool isn't required I figure that will come around soon enough.<sup><a href="#1">1</a></sup> As a sidenote, `public static void main(String[] args)` and `System.out.println` are also no longer going to be needed. Stay tuned. <p id="1" style="font-size: 14px">1: There are two tools that most closely fit the mould today. The first is <a href="https://get-coursier.io/">Coursier</a>, a tool that has been around in the Scala community for a while. 
The second is <a href="https://github.com/bowbahdoe/jresolve-cli">jresolve</a>, a tool I produced that has a few bugs and missing features, but that I think could be a better fit with more time and polish.</p> Sat, 27 Jul 0024 05:00:00 +0000After CrowdStrike, Programmers Deserve Consequences.https://mccue.dev/pages/7-20-24-programmers-deserve-consequences An Anesthesiologist can expect a salary of over $300k. This is because putting you to sleep for surgery is actually kinda risky. If they do their job wrong you die. Their salary reflects the fact that they take on much of the liability for that. When a Structural Engineer finishes a design, they sign off on it. If something goes wrong with that structure due to their negligence, and it kills someone, that engineer might be on the hook for manslaughter. Yesterday a friend of mine was stuck in the Hospital all day. Their computer system went down and that led to a delay of care. Delays in care [kill people](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1955366/). All over the world Hospitals, Airlines, Banks, etc. - critical infrastructure - was taken down by a bad patch in some random bit of software. This time it was CrowdStrike but let's be a hundred percent fucking real with ourselves it could have been anything. It's an open secret that the entire software development field is a bit of a clusterfuck. Attempts to impose standards and restrictions largely fail. It is diminishingly rare to finish a project on budget, on time, and without defects. The education software developers receive is often woefully inadequate. The space is flooded with grifters, conpersons, imbeciles, and fanatics. We idolize and pray to emulate success stories like Facebook (a grand machine which reminds me of birthdays and drives teenagers to suicide.) It's just bad, man. Software "Engineers" are never held personally accountable for the effects their actions have on the world. 
That poor bastard or bastard(s) at CrowdStrike weren't paid anesthesiologist rates and yet their mistake is going to kill a lot of people. I doubt they would have signed off on anything they'd done in the last decade as being "defect-free" and yet that is the standard we rightfully hold other fields to. Something needs to change and I doubt anything other than real, uniformly applied, consequences will make a difference. For a more intelligently spoken, less emotionally driven, take on this watch [the David Sankel talk I embedded below](https://www.youtube.com/watch?v=r_U9YFPWxEE). <iframe width="560" height="315" src="https://www.youtube.com/embed/r_U9YFPWxEE?si=taTljOaqpZxYk-wy" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> ## EDIT To clarify, I am not saying that an individual at the bottom of the chain of decision-making is materially responsible for this outage. Based on the degree to which what was in my head was received as almost the opposite message by so many, I am pretty sure I wrote this poorly. I think [this reddit comment](https://www.reddit.com/r/programming/comments/1e8ipxf/comment/le7mz8w/) did a good job distilling something I wish I got across. > The reason why anethesiologists and structural engineers can take responsibility for their work is because they are legally responsible for the consequences of their actions, specifically of things within their individual control. They are members of regulated, professional credentialing organisations (i.e., only a licensed 'professional engineer' can sign off certain things; only a board-certified anethesiologist can perform on patients.) It has nothing to do with 'respect'. 
> > Software developers as individuals should not be scapegoated in this Crowdstrike situation specifically because they are not licensed, there are no legal standards to be met for the title or the role, and therefore they are the 'peasants' (as the author calls them) who must do as they are told by the business. And also [this post I wrote](https://www.reddit.com/r/programming/comments/1e8ipxf/comment/le8ef4c/?context=3) and [this one a reply down](https://www.reddit.com/r/programming/comments/1e8ipxf/comment/le8ef4c/?context=3) are at least a little clearer on where I think the _blame_ lies for this particular outage. > I am not and wish I never came so close to implying that in this exact instance we should blame a coder for what was clearly a process issue. > > It's just that even though we all know that not unit testing or performing QA is negligent behavior our field doesn't actually have any codes that are enforced by law. > > The reason I implied that programmers should see consequences isn't because I misunderstood how development works or that CrowdStrike was largely caused by chains of terrible management. It's because without any codes similar to those fields we will never be taken seriously. My thought process was "if it matters, we will make codes. If we make codes then maybe we edge closer to being an actual engineering discipline." > > And seriously watch the video I linked. It did a way less shitty job than I did. --- > Yeah I'm a demonstrably bad communicator. > > I agree with everything you are saying and i think we agree on what they shape of things should be. > > But I think without actual codes that you can hold someone to there is no basis upon which to punish a company for not following them. > > Skipping past how we get from here to there, in a world where development of critical systems is regulated and folks are licensed as engineers there should be consequences if one of those licensed engineers is negligent. 
> > But I fucked up hard by just saying programmers deserve consequences. People assumed I meant "yeah let's get the guy who did this!" I really mean "programmers deserve to live in the world where their actions are given weight and recognized as an engineering discipline with consequences for negligence all the way up the chain." Sat, 20 Jul 0024 05:00:00 +0000A Dramatic Reading: I Will Fucking Piledrive You If You Mention AI Againhttps://mccue.dev/pages/7-5-24-a-reading-i-will-fucking-piledrive-you-if-you-mention-ai-again I thoroughly enjoyed reading [this blog post entitled "I Will Fucking Piledrive You If You Mention AI Again"](https://ludic.mataroa.blog/blog/i-will-fucking-piledrive-you-if-you-mention-ai-again/) on the [Ludicity blog.](https://ludic.mataroa.blog/) I choose to hope that it is a sign that professional developers, as a group, have developed at least a few anti-BS antibodies following the crypto bubble popping. I also think that the approach it takes is probably one of the most socially effective ones. More effective than the "well, maybe there is _some_ application of the technology" tepidness that allowed for crypto scams to flourish unfettered and lure in impressionable new coders. So in the spirit of keeping the "you're not welcome here" message alive in the news cycle, I commissioned a professional voice actor to give a full dramatic reading of that blog post.<sup><a href="#1">1</a></sup> Enjoy. <audio controls src="/pages/7-5-24-I-Will-Fucking-Piledrive-You-If-You Mention-AI-Again.mp3"></audio> If you have a project in need of a voice actor, you can find their portfolio [here](https://ryangaiservo.com/). <p id="1" style="font-size: 14px">1: I obviously do not have any rights to the original blog post so I can't say "you can use the recording for anything," but I need to make clear that you are not allowed to use any aspect of the recording to train a generative AI.
Anything else I am able to grant permission for, I do.</p> Fri, 05 Jul 0024 05:00:00 +0000Extension methods make code harder to read, actuallyhttps://mccue.dev/pages/6-22-24-extension-methods-are-harder-to-read I apologize in advance for whatever comment sections form around this. ## What are instance methods? In many languages you can associate functions with a type. ```java class Dog { void bark() { System.out.println("Bark!"); } } ``` The name these are given differs depending on the language you are talking about and who you are talking to, but we'll go forward calling these "instance methods." Instance methods are defined at the same time as the type is declared. ```java class Dog { // Type declared here void bark() { // Method declared within it System.out.println("Bark!"); } } ``` Instance methods can have access to fields or properties of the type they are associated with that might not be accessible to other code. ```java class Dog { private final String name; Dog(String name) { this.name = name; } void bark() { // name is accessible to this method, but not to outsiders if (name.equals("Scooby")) { System.out.println("Scooby-Dooby-Doo!"); } else { System.out.println("Bark!"); } } } ``` And, in languages with the ability to "extend" types, instance methods might be overridden by a subtype. ```java class Pomeranian extends Dog { @Override void bark() { System.out.println("bork."); } } ``` Importantly, instance methods are also "convenient" to call. Most code editors can catch you after you've written the `.` after `dog` and offer an autocomplete list of "methods you might want to call." ```java void main() { var dog = new Dog("Scooby"); // After "dog.b" you should be able to hit enter and // have "dog.bark()" filled in for you. dog.bark(); } ``` In addition to discovery, this is convenient for a practice known as "chaining." If one method returns an object which can itself have methods called on it you can "chain" another method call on the end.
```java void main() { String name = " Scrappy "; name = name .toLowerCase() .strip() .concat(" dappy doo"); System.out.println(name); } ``` This is widely considered to be aesthetically pleasing and will be the surprise villain of today's story. ## What are extension methods? If you are not the author of a type, but want to write functionality that builds upon the exposed methods and fields of one, you can write code of your own. ```java class DogUtils { private DogUtils() {} static void playFetch(Dog dog) { System.out.println("Throwing stick..."); dog.bark(); System.out.println("Stick retrieved."); } } ``` Calling such a method will generally look different from calling an instance method. ```java void main() { var dog = new Dog("Scooby"); DogUtils.playFetch(dog); } ``` Importantly you need to know where to look for it (in this case that there is `playFetch` in `DogUtils`) and won't get that helpful autocomplete from writing `dog.` Externally defined methods also don't play nicely with method chaining. Whenever you need to call them you probably need to "break the chain." ```java void main() { String name = " SCRAPPY "; name = name.toLowerCase(); name = StringUtils.capitalizeFirstLetter(name); name = name .strip() .concat(" Dappy doo"); System.out.println(name); } ``` This is considered aesthetically displeasing. Extension methods are a language feature that allow someone to make calling these externally defined methods look like calling an instance method. 
```java // This is the "manifold" Java superset // http://manifold.systems/docs.html @Extension class DogUtils { private DogUtils() {} static void playFetch(Dog dog) { System.out.println("Throwing stick..."); dog.bark(); System.out.println("Stick retrieved."); } } ``` ```java void main() { var dog = new Dog("Scooby"); dog.playFetch(); // This turns into a call to DogUtils.playFetch } ``` ## Upsides of extension methods Because calling an extension method looks the same as calling an instance method, downstream users of a library can make a suboptimal API more tolerable by adding their own methods. As an example, the Kotlin language uses its extension mechanism to ["add methods" to `java.lang.String`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/replace-first-char.html) that the Kotlin team would prefer existed. This can make code more aesthetically pleasing and enables method chains to go unbroken, which in turn can make code easier to write. ```java void main() { String name = " SCRAPPY "; name = name .toLowerCase() .capitalizeFirstLetter() .strip() .concat(" Dappy doo"); System.out.println(name); } ``` This is often confused with making code easier to read. ## Downsides of extension methods ### 1. They make life harder for library maintainers Java added the `.strip()` method to `String` in Java 11. `.trim()` already existed but it isn't "unicode aware" and won't trim off everything we would consider to be whitespace. As such, it would have been an ideal target for an extension method. ```java @Extension final class StringUtils { private StringUtils() {} static String strip(String s) { // ... } } ``` So if Java had extension methods there would have certainly been code that looks like this out in the world. ```java void main() { String catchphrase = " zoinks "; catchphrase = catchphrase.strip(); System.out.println(catchphrase); } ``` Where every call to `.strip()` was translated to a call to `StringUtils.strip`. 
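To make that translation concrete, here is a rough sketch in plain Java of what such a rewrite would boil down to. This is an assumption about the general shape of the feature, not how Manifold or any real compiler implements it. `StringUtils.strip` is a hypothetical hand-rolled helper (it trims whatever `Character.isWhitespace` reports, which is close to but not guaranteed identical to `String#strip`), and the call in `main` is written in its desugared form by hand.

```java
// Hypothetical stand-in for a class holding an extension method.
final class StringUtils {
    private StringUtils() {}

    // Trims leading and trailing characters that Character.isWhitespace accepts.
    static String strip(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && Character.isWhitespace(s.charAt(start))) {
            start++;
        }
        while (end > start && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }
}

class Main {
    public static void main(String[] args) {
        String catchphrase = "  zoinks  ";
        // With extension methods you would write: catchphrase.strip()
        // Under this sketch's assumption, that call becomes a plain static call:
        catchphrase = StringUtils.strip(catchphrase);
        System.out.println("[" + catchphrase + "]");
    }
}
```

The code an author would actually write is still the `catchphrase.strip()` version shown earlier. The static call is just what would be left once the sugar is removed, and nothing at the call site mentions `StringUtils`.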
Now consider what happens when you go forward in time and the person writing `String` decides to add their own `.strip()` method. If you recompile code that looks like the above does it * A: Fail to compile. The compiler can't decide which one to use, you need to disambiguate somehow. * B: Continue to use the extension method. * C: Switch to using the instance method. All of these options suck. If it fails to compile, now library authors need to consider how likely it is that adding a brand-new method is going to break downstream code. This is something that, in the absence of extension methods, is one of the few things that is basically a free action. If it continues to use the extension method, that can quickly become a code readability hazard. People form their own internal Rolodexes of what methods are available on certain types and what they do. If someone sees `.strip()` called on a `String` it's not unreasonable for them to expect exactly the behavior of `String#strip`. If the semantics of the `strip` extension method differ from the semantics of the instance method...shit. Library maintainers need to care about this because any method they add that is likely to conflict with an existing extension method can trigger exactly this hazard. If it switches to using the instance method, now both library authors and library consumers need to be a lot more cautious when upgrading libraries. Code, as written, could change behavior from something as simple as adding a method. This is worse than failing to compile since at least if the compiler yells at you there is a sign that something is wrong. ### 2. They make code harder to read Welcome to the part that was click-bait. If the invocation of an instance method looks identical to invoking an extension method it is impossible to tell at a glance which is happening. ```java void main() { // Is this an extension method call or an instance method one?
String name = " Velma".stripLeading(); } ``` If the language automatically brings all extension methods "into scope" this problem is global to the entire codebase. If someone in some corner of the world adds an extension method, that can alter the behavior of code or affect whether a particular line compiles. If the language doesn't, that means you need some sort of import to make the extension methods available. ```java // If I hadn't been using this example the whole time, would // you catch that "capitalizeFirstLetter" was the extension method? @Extension(StringUtils.class) class Main { void main() { String name = " SCRAPPY "; name = name .toLowerCase() .capitalizeFirstLetter() .strip() .concat(" Dappy doo"); System.out.println(name); } } ``` This is similar to, and worse than, the situation with `*` imports. One line of code at the top of the file is needed for many other lines to be valid code, but there is no way to visually tie the two together. ```java import java.util.*; void main() { var l = new ArrayList<String>(); } ``` The problem is that readability is about the ease of extracting information from text. Both `*` imports and any hypothetical design of extension methods make it harder to read code because they take information that could be written down and accessible and make it implicit. That can be fine, sometimes. We're not in an anti-golf competition or anything. It is valid to trade readability for ease of writing. But we are lying to ourselves and/or others if we say that extension methods make code more readable. What they do is make some code more aesthetically pleasing. Method chains are considered nice to look at. Beauty is just simply a different thing from comprehensibility. ### 3. They aren't that powerful, actually There are more ways than extension methods to magically attach methods to types. One of the ways that is popular in Scala is to use "implicits."
Whenever you use a type in a context where it wouldn't otherwise work, Scala can implicitly wrap your type in another one that will make it work. What does that mean? Well, if you had a line of code like this. ```scala val name = "fred".capitalizeFirstLetter ``` Then the Scala compiler will look for implicit conversions to a class that does have that method. ```scala class EnrichedString(s: String) { def capitalizeFirstLetter: String = { Character.toUpperCase(s.charAt(0)) + s.substring(1, s.length()) } } given Conversion[String, EnrichedString] with def apply(s: String): EnrichedString = EnrichedString(s) val name = "fred".capitalizeFirstLetter println(name) ``` This is more powerful since you aren't just able to magically add a method, you can magically implement an interface. ```scala trait ThingDoer { def doThing: Unit } class EnrichedString(s: String) extends ThingDoer { def doThing: Unit = { println(s"Hello: ${s}") } } given Conversion[String, ThingDoer] with def apply(s: String): ThingDoer = EnrichedString(s) val thingDoer: ThingDoer = "fred" thingDoer.doThing ``` Are the rules for this confusing? Extremely. > Implicit conversions are applied in two situations: > 1. If an expression e is of type S, and S does not conform to the expression’s expected type T. > 2. In a selection e.m with e of type S, if the selector m does not denote a member of S (to support Scala-2-style extension methods). > > In the first case, a conversion c is searched for, which is applicable to e and whose result type conforms to T. Preach, sister. Which is all to say that extension methods are the Weenie Hut Jr. version of implicits. You get all the downsides of context dependent code and pain for library maintainers, but in place of the really cool features (like external code being able to implement an interface on a type they didn't define) we only get the most vapid benefit. Method chaining. ## Alternatives ### 1.
Use a box If you are working in a language which doesn't have extension methods, but you feel in your bones a strong desire to chain methods, try making a box. ```java import java.util.function.Function; record Box<T>(T value) { <R> Box<R> map(Function<? super T, ? extends R> f) { return new Box<>(f.apply(value)); } } ``` If you box up the value you want to chain methods on, then calling instance methods will actually look the same as calling externally defined ones. ```java void main() { String name = " SCRAPPY "; name = new Box<>(name) .map(String::toLowerCase) .map(StringUtils::capitalizeFirstLetter) .map(String::strip) .map(s -> s.concat(" Dappy doo")) .value(); System.out.println(name); } ``` Is this better than the code without chaining? Debatable. I lean towards no, but if "fluent chaining" is the goal, this achieves the goal. And, unlike a full-blown language feature, it doesn't affect the lives of those for whom method chaining is not an emotional priority. ### Extend the type If the author of a type is okay with you extending it and is ready to consider whatever extensions might exist in the wild when they make new versions of a library, they can make their class open to extension. ```java class Dog { void bark() { System.out.println("Bark!"); } } ``` ```java class Dalmatian extends Dog { void playFetch() { System.out.println("Throwing stick..."); bark(); System.out.println("Stick retrieved."); } } ``` Does this have downsides? Yes, most definitely. You cannot subclass `String` and that's maybe 50-60% of why people want extension methods as a feature. But it's at least a mechanism where the library maintainer controls whether they opt in. ### Add a uniform calling syntax Some languages don't have a special syntax for calling methods defined alongside a type. Accordingly, such languages often do not have an equivalent to extension methods.
```elm import String.Extra name : String name = " shaggy rodgers " |> String.trim -- Defined alongside String |> String.Extra.toSentenceCase -- Defined by third party ``` So one possible path for a language to take would be to appease the method chaining junkies and add a new way to invoke methods that chains with instance methods. ```java void main() { String name = " SCRAPPY "; name = name .toLowerCase() |[StringUtils::capitalizeFirstLetter] .strip() .concat(" Dappy doo"); System.out.println(name); } ``` This is [one of the proposed directions that JavaScript might take](https://github.com/tc39/proposal-pipeline-operator). It has its downsides as well, but they are different downsides. ### Use default interface methods (or an equivalent) While this doesn't help you add methods to arbitrary types you did not make, you can use interfaces to add methods to things in most languages that have them. ```java import java.util.function.Consumer; interface IterableExtended<T> extends Iterable<T> { default void forEachTwice(Consumer<? super T> consumer) { this.forEach(t -> { consumer.accept(t); consumer.accept(t); }); } } ``` ```java import java.util.Iterator; import java.util.List; class Eight implements IterableExtended<Integer> { @Override public Iterator<Integer> iterator() { return List.of(8).iterator(); } } ``` ```java void main() { var eight = new Eight(); eight.forEachTwice(System.out::println); } ``` This is [a sort of extension method](https://stackoverflow.com/questions/29466427/what-was-the-design-consideration-of-not-allowing-use-site-injection-of-extensio), it's just a technique that only works at the declaration site, not for arbitrary consumers to add. ### Deal with it. ```java void main() { String name = " SCRAPPY "; name = name.toLowerCase(); name = StringUtils.capitalizeFirstLetter(name); name = name .strip() .concat(" Dappy doo"); System.out.println(name); } ``` ## Conclusion It is fine to like extension methods.
It is also fine to think they are worth the tradeoffs. What stinks is that people act like there aren't tradeoffs and that they are purely positive.

The sort of vapid "why don't they just add extension methods? Idiots." infects discourse and, while I have no illusions anything I write can stop it, I hope that at least some people now understand why a language might choose to not have them.

Mon, 24 Jun 2024 05:00:00 +0000

Modules Make javac Easy: Part. 2, Dependencies and Tests
https://mccue.dev/pages/5-30-24-module-libs-and-tests

This is a follow-up to [this post](https://mccue.dev/pages/5-29-24-module-compilation).

The biggest things I left out in the workflow I was describing are how to handle external dependencies and how to run tests.

On the one hand, I feel like I understand how those would work today with the tools that exist. On the other, I'm pretty sure it can be done a little better.

Try to focus on whether the "shape" of the process feels alright to you and less on the specifics of any particular command.

## Dependencies

I wrote [a post on this before](https://mccue.dev/pages/1-11-24-cli-flow), but I made a tool called `jresolve`. It resolves transitive dependencies.<sup><a href="#1">1</a></sup>

If you want to install it to follow along, you can use this script.

```bash
bash < <(curl -s https://raw.githubusercontent.com/bowbahdoe/jresolve-cli/main/install)
```

Or download a `.jar` from [GitHub Releases](https://github.com/bowbahdoe/jresolve-cli/releases/tag/v2024.05.26).

You can use `jresolve` to download the libraries you want into a folder.

```bash
jresolve --output-directory libs \
    pkg:maven/org.springframework.boot/spring-boot-starter-web@3.3.0
```

This will include any transitive dependencies of those libraries.

```bash
jresolve --print-tree \
    pkg:maven/org.springframework.boot/spring-boot-starter-web@3.3.0
```

```
org.springframework.boot/spring-boot-starter-web 3.3.0
  . org.springframework.boot/spring-boot-starter 3.3.0
    . org.springframework.boot/spring-boot 3.3.0
      . org.springframework/spring-core 6.1.8
...
```

The `pkg:maven` string is available at the top of the page for any artifact on [Maven Central's Search](https://central.sonatype.com/artifact/com.google.guava/guava).

If the list of dependencies gets too long you can put the dependencies you want in a file, say `libs.txt`.

```
pkg:maven/com.google.guava/guava@33.2.0-jre
pkg:maven/commons-codec/commons-codec@1.17.0
```

Then include that file, prefixed with an `@`, at the end of the command.

```bash
jresolve --output-directory libs @libs.txt
```

Which puts all your dependencies in one place, easily addable to the module path.

```bash
javac \
    -d build/javac \
    --module-path libs \
    --module-source-path "./*/src" \
    --module web.hello
```

```bash
java --module-path libs:build/jar --module web.hello
```

## Running Tests

JUnit has a command line launcher. It's not [perfect](https://github.com/junit-team/junit5/issues/3836) yet and it's not on anything like [SdkMan](https://sdkman.io/), but it is good enough for our purposes.

Add the dependencies you need for the command line launcher and for writing tests to your `libs.txt`.<sup><a href="#2">2</a></sup>

```
pkg:maven/org.junit.jupiter/junit-jupiter-api@5.10.2
pkg:maven/org.junit.platform/junit-platform-console@1.10.2
pkg:maven/org.junit.jupiter/junit-jupiter-engine@5.10.2
```

Make a module for your tests. And make it an `open` module so the test runner can do its magic.

```java
open module web.hello.test {
    requires web.hello;
    requires org.junit.jupiter.api;
}
```

Write a test in this module.

```java
import org.junit.jupiter.api.Test;
import web.hello.HelloController;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class HelloControllerTest {
    @Test
    public void getHello() {
        // assertEquals takes the expected value first, then the actual one.
        assertEquals(
            "Greetings from Spring Boot!",
            new HelloController().index()
        );
    }
}
```

Then you can launch the test runner like any other code.
```bash
java \
    --module-path libs:build/jar \
    --add-modules web.hello.test,web.util.test \
    --module org.junit.platform.console \
    execute \
    --select-module web.hello.test
```

Which is a little long - I have hopes in the future I can write something like the following.

```bash
junit \
    execute \
    --module-path libs:build/jar \
    --select-module web.hello.test
```

But the basics are that you launch junit, point it at your code, and run tests.

## Wrap Up

While all this is more _work_ than adding a dependency to a `pom.xml` and running `mvn test`, I'm not convinced it's more complicated or any less powerful.

If anything the fact that doing things this way lets us interact more directly with tools like `javac` makes it feel more flexible.

I made a repo with this setup using Spring Boot that you can find [here](https://github.com/bowbahdoe/spring-javac-setup). All the commands you would run are in the `Justfile`. I included all the libraries needed in the repo in case you don't want to install my CLI tool for whatever reason.

<p id="1" style="font-size: 14px">1: It's gauche to pitch your own tool, especially one which is admittedly incomplete. One alternative is <a href="https://get-coursier.io/">Coursier</a>.</p>

<p id="2" style="font-size: 14px">2: I know, I know - dependency scopes. This is a relatively large conversation to have, but with the module path things that aren't also "in the graph" aren't included. Having test dependencies in the same `libs` folder as other dependencies isn't as important as with the class path. Yes, making a docker image with just the dependencies needed for runtime needs scopes / a practice emulating it. I'm working my way around.</p>

Thu, 30 May 2024 05:00:00 +0000

Modules Make javac Easy
https://mccue.dev/pages/5-29-24-module-compilation

If you use Java modules, using `javac` to compile your code is easy.

I figure this isn't widely known - it's not that popular for people to use `javac` directly these days - but it's interesting.
## Without Modules

`javac` compiles any files you list in its invocation.

```bash
javac -d build src/Main.java src/Other.java
```

If the other source files are referenced from the ones you listed, you can use `--source-path` and `javac` will find the others.

```bash
# Will find src/Other.java so long as Main uses it
javac -d build \
    --source-path src \
    src/Main.java
```

But, if your source files might not directly reference each other, you need to list every file in your project. That turns into something like this.

```bash
javac -d build \
    $(find . -name "*.java" -type f)
```

Which, while functional, doesn't inspire joy.

## With Modules

All of the above methods work, even if you have a `module-info.java`.

But, if you lay out your code like this

```
example.mod/
  module-info.java
  example/
    mod/
      A.java
      B.java
      C.java
```

i.e. with a directory that has the same name as the module within it - then `javac` can automatically find and compile your code.

```bash
javac \
    -d build \
    --module-source-path . \
    --module example.mod
```

So `--module-source-path` tells it where to find all the code for a module and `--module` tells it what module you want to compile.

If you wanted all your code in a `src/` folder you can do that as well. You just need to tweak the `--module-source-path` argument.

```
example.mod/
  src/
    module-info.java
    example/
      mod/
        A.java
        B.java
        C.java
```

```bash
javac \
    -d build \
    --module-source-path "./*/src" \
    --module example.mod
```

Where this becomes actually pretty cool is if you have more than one module. Just put all your project's modules on the same level.

```
example.mod/
  src/
    module-info.java
    example/
      mod/
        A.java
other.mod/
  src/
    module-info.java
    other/
      mod/
        B.java
```

Now `javac` can compile more than one module at the same time.

```bash
javac \
    -d build \
    --module-source-path "./*/src" \
    --module example.mod,other.mod
```

If modules require each other - like if `example.mod` requires `other.mod` - then all modules will be compiled automatically.
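As a rough sketch of what "requires" means in that last case, the two `module-info.java` files for the layout above might look something like this. The exported package name `other.mod` is an assumption taken from the directory names; the post does not show these files.

```java
// other.mod/src/module-info.java
module other.mod {
    // Assumed: expose the other.mod package to modules that require this one.
    exports other.mod;
}

// example.mod/src/module-info.java
module example.mod {
    // Lets code in example.mod use the exported packages of other.mod.
    requires other.mod;
}
```

With declarations along these lines, asking `javac` for `--module example.mod` alone should be enough for it to find and compile `other.mod` as well, as described above.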
## Other Tools

Once you've laid out your code like this, other tools, like `javadoc`, will also be able to automatically discover code for your modules.

```bash
javadoc \
    -d docs \
    --module-source-path "./*/src" \
    --module example.mod,other.mod
```

Isn't that neat?

## Wrap Up

This leaves off some crucial bits - like how you would get dependencies or run unit tests - but compare it holistically to setting up a multi-module build in Maven. Or Gradle. Or [bld](https://github.com/rife2/bld). Or whatever.

At least to me this feels way less painful. Worthy of a closer look.

I made a repo with a basic version of this setup [here](https://github.com/bowbahdoe/javac-modules-demo). All the commands you would run are in the `Justfile`. I also threw in making jars + including resources.

Wed, 29 May 2024 05:00:00 +0000

Getting Started with java.sql
https://mccue.dev/pages/1-17-24-java-sql

I get a lot of questions based on a very common school assignment. A student is asked to make a desktop GUI app and, as part of that, connect to and work with a locally hosted MySQL database.

In this setup, presumably due to the same set of circumstances that leads to someone showing MySQL as an option for a locally hosted database (W.T.H. right?), people are shown some downright dangerously wrong ways of working with SQL.

This bit of writing is for me to send as a first message next time this comes up.

## What is `java.sql`

`java.sql` is the module that contains the classes needed to connect to SQL databases in Java. We also call this API "JDBC", which stands for Java Database Connectivity.

You don't need to do anything special to get access to this, but if you have a `module-info.java` file in your program you will need to add a `requires java.sql;` line to it.

## Install your database drivers.

Though the mechanisms you use to work with databases come with Java, the code to connect to the specific database you are using will not.
This means [you need to include a dependency](https://mccue.dev/pages/1-11-24-cli-flow).

For MySQL, you need to have [the mysql-connector-j library](https://central.sonatype.com/artifact/com.mysql/mysql-connector-j). You should have been shown how to do this by now, but if not, reach out.

Other DBs:

* [Postgresql](https://central.sonatype.com/artifact/org.postgresql/postgresql)
* [Sqlite](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc)

## Get a `DataSource`

The first thing you want to do is get an object which implements the [`DataSource`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html) interface.

A [`DataSource`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html) is an object that can give you a connection to a database. The exact way to do this varies from database to database, but for MySQL you need to create a `new MysqlDataSource()`.

This is also the step where you should fill in any authentication info like username and password.

Also, only create one of these at the top of your program and pass it to everything else. Do not create a [`DataSource`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html) every time you want to run a query.

For MySQL this is going to be a `MysqlDataSource`. For Postgres start with `PGSimpleDataSource`. For SQLite, `SQLiteDataSource`.

The exact `.set*` methods you need to call will be different depending on your db and maybe your deployment situation.
```java
import javax.sql.DataSource;
import com.mysql.cj.jdbc.MysqlDataSource;

class Main {
    public static void main(String[] args) {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");
    }
}
```

## Get a `Connection`

Once you have a [`DataSource`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html) you can call the [`getConnection`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html#getConnection()) method to get an active connection to the database.

```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import javax.sql.DataSource;

class Main {
    // getConnection() can throw a checked SQLException, so main declares it.
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {

        }
    }
}
```

You will notice that I put the connection inside a `try( ... ) {}` thing. This is called a `try-with-resources` and all it does is make sure to call `conn.close()` after the block is exited, even if an exception happens.

Since you generally want to close a connection when you are done with it, this is the way to go.

The alternative is this, which you might have seen on your teacher's slides and the example code you were given. This hasn't been needed since **2011**.

```java
Connection conn = null;
try {
    conn = db.getConnection();
    // Code that might crash
} finally {
    if (conn != null) {
        conn.close();
    }
}
```

While you *can* re-use connections, I have to ask that you do not store any [`Connection`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/Connection.html) objects in fields. Whenever you need a connection object, get a fresh one from the [`DataSource`](https://docs.oracle.com/en/java/javase//21/docs/api/java.sql/javax/sql/DataSource.html). This might sound inefficient, but trust me, it's better than the alternatives.
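One way that advice might look in practice is sketched below: a class that keeps the `DataSource` in a field and borrows a fresh `Connection` inside each method. `PersonStore`, `ping`, and `CountingDataSource` are hypothetical names made up for this sketch. `CountingDataSource` is only a stand-in built with `java.lang.reflect.Proxy` so the example runs without a real database; in a real program you would pass in your `MysqlDataSource` instead.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

// Hypothetical class: it stores the DataSource, never a Connection.
class PersonStore {
    private final DataSource db;

    PersonStore(DataSource db) {
        this.db = db;
    }

    // Each call borrows a fresh connection and closes it when the block exits.
    boolean ping() throws SQLException {
        try (Connection conn = db.getConnection()) {
            return !conn.isClosed();
        }
    }
}

// Stand-in DataSource (an assumption so the sketch runs without a database).
// It only counts how many connections were opened and closed.
class CountingDataSource {
    static final AtomicInteger opened = new AtomicInteger();
    static final AtomicInteger closed = new AtomicInteger();

    static DataSource create() {
        ClassLoader loader = CountingDataSource.class.getClassLoader();
        return (DataSource) Proxy.newProxyInstance(
            loader,
            new Class<?>[] { DataSource.class },
            (proxy, method, args) -> {
                if (method.getName().equals("getConnection")) {
                    opened.incrementAndGet();
                    return fakeConnection(loader);
                }
                throw new UnsupportedOperationException(method.getName());
            }
        );
    }

    private static Connection fakeConnection(ClassLoader loader) {
        return (Connection) Proxy.newProxyInstance(
            loader,
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "isClosed":
                        return false;
                    case "close":
                        closed.incrementAndGet();
                        return null;
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            }
        );
    }
}

class FreshConnectionDemo {
    public static void main(String[] args) throws Exception {
        PersonStore store = new PersonStore(CountingDataSource.create());
        store.ping();
        store.ping();
        // Prints "2 opened, 2 closed": one connection per call, each one closed.
        System.out.println(CountingDataSource.opened.get() + " opened, "
            + CountingDataSource.closed.get() + " closed");
    }
}
```

Because the class only ever sees the `DataSource` interface, nothing in it should need to change if the plain data source is later swapped for a different implementation.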
## Create a `PreparedStatement`

There are other ways to run queries on your database, but this is the most consistent one.

On a connection object you can call a method named `prepareStatement` and give it a `String` containing a SQL Query.

This [`PreparedStatement`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/PreparedStatement.html) object also should be set up to automatically close like a [`Connection`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/Connection.html).

```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT 1 as number;"
            )) {

            }
        }
    }
}
```

## Get a `ResultSet`

To execute a SQL query that will give you results, you call [`executeQuery`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/PreparedStatement.html#executeQuery()) on a [`PreparedStatement`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/PreparedStatement.html).

This gives you an object called a [`ResultSet`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/ResultSet.html).

A [`ResultSet`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/ResultSet.html) represents a "cursor" over all the rows that came as results from your queries. It starts before any rows in the query and each time you call [`next`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/ResultSet.html#next()) it moves to the next row.

Once you are at a particular row, you call various `.get*` methods to access the data in that row.
```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT 1 as number;"
            )) {
                ResultSet rs = stmt.executeQuery();
                rs.next();
                System.out.println(rs.getInt("number"));
            }
        }
    }
}
```

If you select more than one row, you can use the fact that `rs.next()` returns `false` when there are no more rows to loop through them all.

```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT name FROM person;"
            )) {
                ResultSet rs = stmt.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}
```

And if you are unsure if you will even get one row, you can use that fact in a similar way.
```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT name FROM person WHERE ssn='111111111';"
            )) {
                ResultSet rs = stmt.executeQuery();
                if (rs.next()) {
                    System.out.println(rs.getString("name"));
                } else {
                    System.out.println("No matching person");
                }
            }
        }
    }
}
```

## Set parameters

The queries you want to run will involve data that comes from a user typing stuff into a box.

The way to deal with this is not, I repeat not, under any circumstances the following.

```java
"SELECT name FROM person WHERE birthday='" + birthday + "'";
```

This is the root cause of [SQL Injection](https://owasp.org/www-community/attacks/SQL_Injection) and is generally not something you want to ever do.

The way to include data in a query is to put a `?` in the places that data should go, then call various `.set*` methods to set the data.

You pass them the position of the `?` you are replacing and then the data. These positions start counting from `1`, which is unusual for Java.

```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT name FROM person WHERE birthday=?"
            )) {
                stmt.setString(1, "9/9/1999");
                ResultSet rs = stmt.executeQuery();
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}
```

## Use Multi-Line Strings

As your queries get bigger, they will probably be going on multiple lines. To do this use three double quotes on either side.

```java
"""
SELECT name FROM person
WHERE birthday=?
"""
```

It's important to know this because your maybe very old curriculum will still have examples like

```java
"SELECT name FROM person \n" +
    "WHERE birthday=?"
```

Which can get tedious.

## Execute non-queries

To do something that isn't a query, like inserting rows, you use the `.execute()` method instead of `.executeQuery()`. This will not give you a [`ResultSet`](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/ResultSet.html) object.

```java
import com.mysql.cj.jdbc.MysqlDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource db = new MysqlDataSource();
        db.setPort(3306);
        db.setUser("username");
        db.setPassword("password");

        try (Connection conn = db.getConnection()) {
            try (PreparedStatement stmt = conn.prepareStatement(
                """
                INSERT INTO person(name, status)
                VALUES (?, ?)
                """
            )) {
                stmt.setString(1, "tiny tim");
                stmt.setString(2, "not dead");
                stmt.execute();
            }
        }
    }
}
```

## Pool your connections

Getting a fresh connection to the database every time you want to make a query is ultimately inefficient.

Think of getting a connection like making a phone call. You need to dial, it needs to ring, and the other end needs to pick up. That all takes time.

To resolve this we use "Connection Pools." These are `DataSource` implementations which keep some number of connections always active and re-use them between calls to `.getConnection`.

*The* library to use for this is called [`HikariCP`](https://central.sonatype.com/artifact/com.zaxxer/HikariCP).
```java
import com.mysql.cj.jdbc.MysqlDataSource;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class Main {
    public static void main(String[] args) throws Exception {
        MysqlDataSource mysql = new MysqlDataSource();
        mysql.setPort(3306);
        mysql.setUser("username");
        mysql.setPassword("password");

        HikariConfig config = new HikariConfig();
        config.setDataSource(mysql);
        HikariDataSource db = new HikariDataSource(config);

        try (var conn = db.getConnection()) {
            // ...
        }
    }
}
```

Note that you do not need to pool connections with a database like SQLite. There, making a connection isn't like making a phone call, it's like shouting at your cousin in the other room. There's only one cousin and he can hear you.

Wed, 17 Jan 2024 05:00:00 +0000

The Java Command Line Workflow
https://mccue.dev/pages/1-11-24-cli-flow

A while ago, I released a draft of the [jresolve](https://github.com/bowbahdoe/jresolve-cli) command line tool. Its function is to take a set of root dependency declarations and resolve the full set of transitive dependencies.

I'm happy with the API, but there are some things to fix up.

But I think `jresolve`'s existence, and why I bothered to make it, only makes sense as a part of a larger story. This is an attempt to tell it.

## The Problem

We all use Maven or Gradle. There are other up-and-coming build tools like [bld](https://rife2.com/bld) and some [Ant](https://ant.apache.org/) holdovers from the 2000s, but if you threw darts at Java codebases that is what you would hit.

This is a good state of affairs in many ways, but there are downsides. The specific downside I want to focus on is how it affects the way people learn Java.

What follows are my own opinions and perception.

## Step 1.

When people learn how to code, typically they start with a "Hello, world" program. In the past, this part involved hand-waving away `public static void main(String[] args)`. In some future release of Java [it will be simpler](https://openjdk.org/jeps/445).
That's great. I'll talk about how that can affect curriculums at some point.

But from a tooling perspective, this step is a choice between having them run `java Main.java` on the command line and having them click the "Big Green Run Button" in whatever text editor they installed or online platform they signed up for.

## Step 2.

You can actually go pretty far into the language without leaving a single file, but at some point a student needs to have more than one file in their projects.

To do this, you again have a (non-exclusive) choice of how to approach it. Either the command line or the Big Green Run Button.

If you take the command line route, it [will still be](https://openjdk.org/jeps/458) `java Main.java`, followed by `java src/Main.java` after you guide them to keep their code in a folder.

Green button is the green button.

## Step 3.

Because it will be relevant to things to come, you at some point want to explain that Java code can be compiled ahead of time to `.class` files.

You could point to the directory where the B.G.R.B. put the class files, or you could explain what `javac` is and how to use it. For that you would land on something like this.

```
javac --source-path ./src -d classes src/Main.java
java --class-path classes Main
```

So you would have been able to introduce `javac`, the concept of ahead of time compilation, `.class` files, and the `--class-path`.

## Step 4.

Once they've made an app [the next thing they'll want](https://www.youtube.com/watch?v=WszAwrlS5GM) is to package it up into a jar.

In the 🟢 world, you show them a menu in their editor and what buttons to click. In the CLI, it's a chance to show them how to use the `jar` tool.

```
jar --create --file app.jar --main-class Main -C classes .
```

## Step 5.

This is where things get tricky, because once they know how to build an app it won't be long before they want to make something that requires a dependency.

And it is this step where things fall apart.
If you are lucky, the dependency they need has no transitive dependencies. You show them how to download a `.jar` file, how to add it to the IDE or where to put it on the `--class-path`, and warn them that they won't be able to get away with that forever.

If you aren't, you need Maven or Gradle. That is by far the easiest way to make sure they get their dependencies. It is also easy to justify. Chances are any Java job would use one of those.

One problem is that because Maven and Gradle also take over compiling the code, you invalidate their investment in learning how to use `javac` and `jar`. They won't be using either of those from now on.

Another is that both are going to throw a lot in their face. Either an entirely new programming language with Gradle or a relatively beefy `pom.xml` with Maven.

This is what a "blank" Maven gives you in IntelliJ. It's not *horrible*, but it does have some `public static void main(String[] args)`-like properties.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>untitled92</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>21</maven.compiler.source>
        <maven.compiler.target>21</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

</project>
```

You could also use something like [`jbang`](https://www.jbang.dev/), which automatically downloads dependencies declared as comments in the code. But this stops being viable if you want those dependencies for things other than running the code as a script, unfortunately.

So the point of `jresolve` specifically is to let you stick with the command-line flow and avoid pivoting to build tools until later.
You show them how to use it to get dependencies from the internet as well as how to use those dependencies.

```
jresolve \
    --output-directory libraries \
    pkg:maven/de.gurkenlabs/litiengine@0.8.0

javac \
    --module-path libraries \
    --add-modules ALL-MODULE-PATH \
    --source-path ./src \
    -d classes \
    src/Main.java

java \
    --module-path libraries \
    --add-modules ALL-MODULE-PATH \
    --class-path classes \
    Main
```

As an aside, while it requires the unpleasant appearance of `ALL-MODULE-PATH` I would argue that showing early that you should put your external dependencies on the `--module-path` is a good thing.

## Step 6.

If you took the path enabled by a tool like `jresolve`, the commands you are asking folks to run likely are getting pretty hard to remember.

It is a good time to introduce some "command runner" mechanism. Some way so they only have to say `compile` instead of a long `javac` incantation.

For this purpose, I have a liking for [just](https://github.com/casey/just), but shell scripts, makefiles, etc. are all valid.

```just
help:
    just --list

clean:
    rm -rf classes
    rm -rf libraries

install:
    rm -rf libraries
    jresolve \
        --output-directory libraries \
        pkg:maven/de.gurkenlabs/litiengine@0.8.0

compile:
    rm -rf classes
    javac \
        --module-path libraries \
        --add-modules ALL-MODULE-PATH \
        --source-path ./src \
        -d classes \
        src/Main.java

run:
    java \
        --module-path libraries \
        --add-modules ALL-MODULE-PATH \
        --class-path classes \
        Main
```

Now that they can do something spiritually like `just compile`, commands no longer need to be produced from memory every time they want to do things with their code. It also instills the notion that software projects are generally built by a set of named and repeatable processes.

And if you haven't already, or it's just time for a refresher, you can use this as an opportunity to go a little deeper into the command line and explain tools like `cd` and `rm`.

## Step 7.
Now that they know how to use dependencies and can run somewhat involved processes in the CLI, you can show them how to package their code to share with someone who doesn't have Java.

If they've made games and other such GUI things, then you can show them `jpackage`. Have them make a `jar` with their classes and show them the flags to include their dependencies and make an installer.

If you've kept more of a server-y focus, maybe you'd just show them how to copy files to a remote machine, maybe you'd go through something like docker and show them the commands to build images.

But at this point they have some idea of how to "ship" their code.

## Step 8.

Now that they have all the mechanisms to deliver code from concept to product, they are going to start making big projects.

At least some students will have an idea for a game or a website or similar they're going to invest a lot of time in, but also the assignments you are giving will probably require more structure.

A thing I've seen in a lot of curriculums is to have everyone do the `M` part of an `MVC` type assignment and swap `M`s with another group and write the `V` and `C` using that.

As such, it is as good a time as any to introduce modules. The path of least resistance would be to introduce the multi-module format that `javac` understands. I.e. you have a top level directory for each module that has the module's name.

```
some.mod/
  src/
    module-info.java
    some/
      mod/
        ...
other.mod/
  src/
    module-info.java
    other/
      mod/
        ...
```

```
javac \
    -d compiled \
    --module-path libraries \
    --module-source-path "./*/src" \
    --module some.mod,other.mod
```

That will require talking about visibility and packages, so it's a good point to also start talking about higher level concepts like encapsulation and library contracts.

## Step 9.

Maybe this can be done a bit sooner, but you definitely need to show those goobers how to write unit tests now.

The best way to do this is to show them how to use `junit`.
```
java --module-path libraries:compiled \
    --add-modules ALL-MODULE-PATH \
    org.junit.platform.console.ConsoleLauncher execute --scan-modules
```

Maybe there can be a `junit` executable ready via some mechanism, but either way all the mechanics of even this relatively verbose incantation have been shown.

And this is a great point to introduce the practice of having a separate `test` folder. Also potentially `resources` since tests can use those as a source of test data.

## Step 10.

At some point, now that they know how to write code, write tests, design modules, etc., it would be a good time to get into library writing. Not everyone will, but some will try their hand.

It's at this point that learning Maven or Gradle probably becomes needed, though I think with a smidgen more tooling that can be delayed. Maybe just something to generate a POM + `jreleaser` would be enough.

## Step N.

Then, at some point, they have need for a real build tool. I won't opine on this, but I think some people wouldn't ever reach this step.

They will have already gotten a relatively deep understanding of the underlying tools, naturally come across the concept of a build task, know what a library is and what Maven coordinates are and do, and they will have enough context to know why Maven would choose `src/main/java` as the place to put code.

I think this is a healthier level to engage with build-tools at. Understanding what tasks they automate because you've done those tasks manually.

It also gives a firmer foundation for the more exotic parts of tooling like agents, AppCDS, annotation processors, etc. Build tools aren't always the most intuitive with those.

## Conclusion

This might not be appropriate for all curriculums. Sometimes you are in a boot-camp and you just gotta be employable with Spring in 6 months.

But when the goal of an education isn't optimizing time to employment, I think teaching with the command-line first has value.
I just think that in order for it to be actually practical, a few more pieces need to be in place.

So that aspiration is what `jresolve` is for. It is what whatever CLI tool I make next will probably be for.

That's the vision. Have the JDK be enough, by itself, to get bootstrapped into modern software development. Lower the barrier of entry to be around that of JavaScript and Python.

----

Tell me what I got wrong in the comments below.

Thu, 11 Jan 2024 05:00:00 +0000

org.xerial.sqlitejdbc
https://mccue.dev/pages/12-24-23-java-library-of-the-day-24

- Maven: [`org.xerial/sqlite-jdbc`](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc)
- Module Name: `org.xerial.sqlitejdbc`
- GitHub: [`xerial/sqlite-jdbc`](https://github.com/xerial/sqlite-jdbc)

```mermaid
graph TD
    classDef green stroke:#f00
    naming[java.naming]
    logging[java.logging]
    xml[java.xml]
    transaction[java.transaction.xa]
    slf4j[org.slf4j]
    sql[java.sql]
    logging --> sql
    xml --> sql
    transaction --> sql
    sqlrowset[java.sql.rowset]
    sql --> sqlrowset
    naming --> sqlrowset
    sqlite[org.xerial.sqlitejdbc]:::green
    slf4j --> sqlite
    sql --> sqlite
    sqlrowset --> sqlite
```

## What is it

`org.xerial.sqlitejdbc` lets you create and interact with a [SQLite](https://www.sqlite.org/index.html) database from Java.

## Why use it

To quote the blurb on [the SQLite website](https://www.sqlite.org/index.html)

> SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine. SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.
>
> The SQLite file format is stable, cross-platform, and backwards compatible and the developers pledge to keep it that way through the year 2050. SQLite database files are commonly used as containers to transfer rich content between systems and as a long-term archival format for data.
There are over 1 trillion (1e12) SQLite databases in active use. > > SQLite source code is in the public-domain and is free to everyone to use for any purpose. So if you want a data store, and it's okay if that data is in a file on the filesystem, SQLite is a very good choice. ## Getting Started ```java import org.sqlite.SQLiteDataSource; import java.util.List; void main() throws Exception { var db = new SQLiteDataSource(); db.setUrl("jdbc:sqlite:database.db"); try (var conn = db.getConnection(); var stmt = conn.prepareStatement(""" CREATE TABLE IF NOT EXISTS widget( id integer not null primary key, name text not null ) """)) { stmt.execute(); } try (var conn = db.getConnection()) { for (var name : List.of("Bob", "Susan", "Sob", "Busan")) { try (var stmt = conn.prepareStatement(""" INSERT INTO widget(name) VALUES (?) """)) { stmt.setString(1, name); stmt.execute(); } } } // id=1, name=Bob // id=2, name=Susan // id=3, name=Sob // id=4, name=Busan try (var conn = db.getConnection(); var stmt = conn.prepareStatement(""" SELECT id, name FROM widget """)) { var rs = stmt.executeQuery(); while (rs.next()) { System.out.println( STR."id=\{rs.getInt("id")}, name=\{rs.getString("name")}" ); } } } ```Sun, 24 Dec 0023 05:00:00 +0000de.poiu.apronhttps://mccue.dev/pages/12-23-23-java-library-of-the-day-23 - Maven: [`de.poiu.apron/apron`](https://central.sonatype.com/artifact/de.poiu.apron/apron) - Module Name: `de.poiu.apron` - GitHub: [`hupfdule/apron`](https://github.com/hupfdule/apron) ```mermaid graph TD classDef green stroke:#f00 logging[java.logging] apron[de.poiu.apron]:::green logging ---> apron ``` ## What is it `de.poiu.apron` gives you the ability to read and write [properties](https://docs.oracle.com/javase/8/docs/api/java/util/Properties.html) files while preserving comments, whitespace, and order of entries. 
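To see what there is to preserve, here is a minimal sketch of a round trip through the standard `java.util.Properties`, using only the JDK. The class name `PropertiesRoundTrip` and the helper `roundTrip` are made up for illustration and are not part of `de.poiu.apron`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class: shows what java.util.Properties does to a file
// when it is loaded and written straight back out.
public class PropertiesRoundTrip {
    static String roundTrip(String contents) {
        try {
            var properties = new Properties();
            properties.load(new StringReader(contents));

            var out = new StringWriter();
            // Passing null skips the optional header comment.
            // A timestamp comment is still written by default.
            properties.store(out, null);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        var contents = """
                key=value
                # Context here
                otherKey=otherValue
                """;

        // The key/value pairs survive. The "# Context here" comment does not,
        // and the order of the entries is not guaranteed.
        System.out.println(roundTrip(contents));
    }
}
```

The keys and values come back, but anything a human added around them is gone.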
## Why use it [java.util.Properties](https://docs.oracle.com/javase/8/docs/api/java/util/Properties.html) is one of the simpler ways to add configuration to a project. Properties files are just key value pairs separated by an equals sign. ```properties key=value other=otherValue ``` But if you want to edit a properties file programmatically while keeping any formatting, ordering, or commenting that a human did manually you will run into trouble. This is the niche that `de.poiu.apron` fills. You can have configuration files which are updated by a program and a human interchangeably. ## Getting Started ```java import de.poiu.apron.PropertyFile; import de.poiu.apron.entry.PropertyEntry; import java.nio.file.Files; import java.nio.file.Path; void main() throws Exception { var path = Path.of("config.properties"); var fileContents = """ key=value # Context here otherKey=otherValue """; Files.writeString(path, fileContents); PropertyFile file = PropertyFile.from(path.toFile()); // value System.out.println(file.get("key")); file.appendEntry(new PropertyEntry("port", "4031")); file.saveTo(path.toFile()); // key=value // # Context here // otherKey=otherValue // port = 4031 System.out.println(Files.readString(path)); } ```Sat, 23 Dec 0023 05:00:00 +0000com.ethlo.timehttps://mccue.dev/pages/12-22-23-java-library-of-the-day-22 - Maven: [`com.ethlo.time/itu`](https://central.sonatype.com/artifact/com.ethlo.time/itu) - Module Name: `com.ethlo.time` - GitHub: [`ethlo/itu`](https://github.com/ethlo/itu) ```mermaid graph TD classDef green stroke:#f00 slf4j[com.ethlo.time]:::green ``` ## What is it `com.ethlo.time` provides utilities for parsing and producing the date and time formats that you are likely to run into on the internet. Namely, [RFC-3339](https://www.ietf.org/rfc/rfc3339.txt) timestamps and the [W3C date and time Formats](https://www.w3.org/TR/NOTE-datetime). 
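As a point of reference, here is a small sketch of handling one such timestamp with only the generic `java.time` API. The class name `Rfc3339WithJavaTime` and the helper `toUtc` are made up for illustration. It relies on the default ISO-8601 parser of `OffsetDateTime`, which is an assumption that only holds for timestamps spelled the way that parser expects.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical demo class: parses a timestamp with the generic java.time API
// and re-expresses it in UTC.
public class Rfc3339WithJavaTime {
    static String toUtc(String timestamp) {
        // Uses DateTimeFormatter.ISO_OFFSET_DATE_TIME under the hood.
        OffsetDateTime parsed = OffsetDateTime.parse(timestamp);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).toString();
    }

    public static void main(String[] args) {
        // 2012-12-27T22:07:22.123456789Z
        System.out.println(toUtc("2012-12-27T19:07:22.123456789-03:00"));
    }
}
```

This works, but every caller has to know which parser and which output format to reach for.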
## Why use it While the `java.time` packages provide generic date and time parsing and can support a wide variety of formats, you still need to know what formats to pick. You also need to pick the same ones everywhere in your program. This streamlines that process for the common case of working with time information you got from, or you want to put out into, the internet. It also is [reportedly faster](https://github.com/ethlo/itu/blob/main/README.md) than the code you would produce using the generic APIs. ## Getting Started ```java import java.time.OffsetDateTime; import com.ethlo.time.DateTime; import com.ethlo.time.ITU; void main() { DateTime dateTime = ITU.parseLenient("2012-12-27T19:07Z"); // 2012-12-27T19:07Z System.out.println(dateTime); OffsetDateTime offsetDateTime = ITU.parseDateTime("2012-12-27T19:07:22.123456789-03:00"); // 2012-12-27T22:07:22Z System.out.println(ITU.formatUtc(offsetDateTime)); // 2012-12-27T22:07:22.123Z System.out.println(ITU.formatUtcMilli(offsetDateTime)); // 2012-12-27T22:07:22.123456Z System.out.println(ITU.formatUtcMicro(offsetDateTime)); // 2012-12-27T22:07:22.123456789Z System.out.println(ITU.formatUtcNano(offsetDateTime)); } ```Fri, 22 Dec 0023 05:00:00 +0000com.fasterxml.uuidhttps://mccue.dev/pages/12-21-23-java-library-of-the-day-21 - Maven: [`com.fasterxml.uuid/java-uuid-generator`](https://central.sonatype.com/artifact/com.fasterxml.uuid/java-uuid-generator) - Module Name: `com.fasterxml.uuid` - GitHub: [`cowtowncoder/java-uuid-generator`](https://github.com/cowtowncoder/java-uuid-generator) ```mermaid graph TD classDef green stroke:#f00 slf4j[org.slf4j] uuid[com.fasterxml.uuid]:::green slf4j --> uuid ``` ## What is it `com.fasterxml.uuid` has methods to generate, and customize the generation of, UUIDs. ## Why use it Most of the time, folks use `UUID.randomUUID()` to get their universally unique identifiers. That makes a UUIDv4. 
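A quick way to check this, using only the JDK, is to ask a `UUID` for its version. The class name `UuidVersionCheck` and its helper methods are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical demo class: reports which UUID versions the JDK factory
// methods produce.
public class UuidVersionCheck {
    static int randomVersion() {
        return UUID.randomUUID().version();
    }

    static int nameBasedVersion(String name) {
        // nameUUIDFromBytes hashes its input with MD5, which makes a UUIDv3.
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8)).version();
    }

    public static void main(String[] args) {
        // 4
        System.out.println(randomVersion());
        // 3
        System.out.println(nameBasedVersion("string to hash"));
    }
}
```

So out of the box, the factory methods on `java.util.UUID` cover versions 3 and 4.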
But the world of UUIDs is more varied than that and there are different kinds of UUIDs [that you might want to use](https://generate-uuid.com/which-uuid-version-should-you-use). This includes [UUIDv6](https://uuid6.github.io/uuid6-ietf-draft/) and [UUIDv7](https://uuid6.github.io/uuid6-ietf-draft/), which aren't referenced in the above link. Fun fact though, this library [predates the addition of `UUID.randomUUID()` to the standard library](https://cowtowncoder.medium.com/measuring-performance-of-java-uuid-fromstring-or-lack-thereof-d16a910fa32a). ## Getting Started ```java import com.fasterxml.uuid.Generators; void main() { var uuidv7 = Generators .timeBasedEpochGenerator().generate(); // Version 7 System.out.println(uuidv7); var uuidv5 = Generators .nameBasedGenerator() .generate("string to hash"); System.out.println(uuidv5); } ```Thu, 21 Dec 0023 05:00:00 +0000dev.mccue.microhttp.sessionhttps://mccue.dev/pages/12-20-23-java-library-of-the-day-20 - Maven: [`dev.mccue/microhttp-session`](https://central.sonatype.com/artifact/dev.mccue/microhttp-session) - Module Name: `dev.mccue.microhttp.session` - GitHub: [`bowbahdoe/microhttp-session`](https://github.com/bowbahdoe/microhttp-session) ```mermaid graph TD classDef green stroke:#f00 microhttp[org.microhttp] microhttpsetcookie[dev.mccue.microhttp.setcookie] microhttp --> microhttpsetcookie microhttpcookies[dev.mccue.microhttp.cookies] microhttp --> microhttpcookies microhttp-handler[dev.mccue.microhttp.handler] microhttp --> microhttp-handler async[dev.mccue.async] json[dev.mccue.json] microhttpsession[dev.mccue.microhttp.session]:::green microhttpsetcookie --> microhttpsession microhttpcookies --> microhttpsession microhttp-handler --> microhttpsession async --> microhttpsession json --> microhttpsession ``` ## What is it `dev.mccue.microhttp.session` provides an interface for encoding session data in microhttp responses and decoding session data from microhttp requests. Last one from me for this series, I promise. 
This just took a lot of build up. ## Why use it If you are making a classical web app, [and maybe you should](https://htmx.org/essays/splitting-your-apis/), then you will want to store persistent data about your users. Most often logins, but other things like [flash data](https://flask.palletsprojects.com/en/2.3.x/patterns/flashing/) are also fair game. This provides a composable interface to that capability. ## Getting Started This example uses [ScopedValue](https://openjdk.org/jeps/429)s so will require preview features. ```java import dev.mccue.json.JsonDecoder; import dev.mccue.microhttp.handler.DelegatingHandler; import dev.mccue.microhttp.handler.RouteHandler; import dev.mccue.microhttp.html.HtmlResponse; import dev.mccue.microhttp.session.ScopedSession; import dev.mccue.microhttp.session.SessionManager; import dev.mccue.microhttp.session.SessionStore; import org.microhttp.EventLoop; import org.microhttp.Options; import java.util.List; import java.util.regex.Pattern; import static dev.mccue.html.Html.HTML; void main() throws Exception { var indexHandler = RouteHandler.of( "GET", Pattern.compile("/"), request -> { var name = ScopedSession.get() .get("name", JsonDecoder::string) .orElse("?"); return new HtmlResponse(HTML.""" <h1> Your name is \{name} </h1> """); } ); var nameHandler = RouteHandler.of( "GET", Pattern.compile("/name/(?<name>.+)"), (matcher, request) -> { ScopedSession.update(data -> data.with("name", matcher.group("name"))); return new HtmlResponse(HTML."Go back to /"); } ); var notFound = new HtmlResponse(404, HTML."Not Found"); var error = new HtmlResponse(500, HTML."Internal Server Error"); // Can also store in encrypted cookies var store = SessionStore.inMemory(); var manager = SessionManager.builder() .store(store) .build(); var rootHandler = ScopedSession.wrap(manager, new DelegatingHandler(List.of(indexHandler, nameHandler), notFound) ); var eventLoop = new EventLoop((request, callback) -> { try { 
callback.accept(rootHandler.handle(request).intoResponse()); } catch (Exception e) { callback.accept(error.intoResponse()); } }); eventLoop.start(); eventLoop.join(); } ```Wed, 20 Dec 0023 05:00:00 +0000dev.mccue.asynchttps://mccue.dev/pages/12-19-23-java-library-of-the-day-19 - Maven: [`dev.mccue/async`](https://central.sonatype.com/artifact/dev.mccue/async) - Module Name: `dev.mccue.async` - GitHub: [`bowbahdoe/java-async-utils`](https://github.com/bowbahdoe/java-async-utils) ```mermaid graph TD classDef green stroke:#f00 async[dev.mccue.async] ``` ## What is it `dev.mccue.async` provides one class - `Atom`. `Atom` wraps an `AtomicReference` and gives a simpler, if less powerful, API that is geared around atomic compare and swap operations. ## Why use it If you are from the Clojure world, this gives an API directly inspired by its `atom` construct. That can be appealing if you want to have managed immutable state and are used to that world. The primary utility provided is having the atomic compare and swap logic already written out for you. It's only a handful of lines, but not something appealing to copy around a codebase. ## Getting Started ```java import java.util.ArrayList; import dev.mccue.async.Atom; void main() throws Exception { var data = Atom.of(0); // 0 System.out.println(data.get()); data.swap(x -> x + 1); // 1 System.out.println(data.get()); // A bunch of concurrent swaps is sorta a worse // case situation for an atomic reference perf. // wise, but a good illustration of correctness. 
    var threads = new ArrayList<Thread>();
    for (int i = 0; i < (10000 - 1); i++) {
        threads.add(
            Thread.startVirtualThread(() -> data.swap(x -> x + 1))
        );
    }

    for (var thread : threads) {
        thread.join();
    }

    // 10000
    System.out.println(data.get());
}
```
Tue, 19 Dec 0023 05:00:00 +0000

dev.mccue.microhttp.json
https://mccue.dev/pages/12-18-23-java-library-of-the-day-18

- Maven: [`dev.mccue/microhttp-json`](https://central.sonatype.com/artifact/dev.mccue/microhttp-json)
- Module Name: `dev.mccue.microhttp.json`
- GitHub: [`bowbahdoe/microhttp-json`](https://github.com/bowbahdoe/microhttp-json)

```mermaid
graph TD
  classDef green stroke:#f00

  json[dev.mccue.json]
  microhttpjson[dev.mccue.microhttp.json]:::green
  json ---> microhttpjson
```

## What is it

`dev.mccue.microhttp.json` provides `JsonResponse`, a class which implements `IntoResponse` and thus can be used alongside `microhttp` and `microhttp-handler` to produce responses which contain JSON.

It automatically adds the appropriate `Content-Type` header, determines the HTTP reason phrase with `reasonphrase`, and accepts the `Json` type provided by `dev.mccue.json`.

## Why use it

If you are using `microhttp` with `microhttp-handler`, it boxes up the logic needed in order to return json responses.
This would otherwise be cumbersome to write at every needed location.

## Getting Started

```java
import dev.mccue.json.Json;
import dev.mccue.microhttp.handler.RouteHandler;
import dev.mccue.microhttp.json.JsonResponse;
import org.microhttp.Request;

import java.util.regex.Pattern;
import java.util.regex.Matcher;

class BasicHandler extends RouteHandler {
    // The constructor name has to match the class name.
    BasicHandler() {
        super("GET", Pattern.compile("/"));
    }

    @Override
    public JsonResponse handleRoute(
            Matcher matcher,
            Request request
    ) {
        return new JsonResponse(
                Json.objectBuilder()
                        .put("name", "bob")
                        .build()
        );
    }
}
```
Mon, 18 Dec 0023 05:00:00 +0000

dev.mccue.json
https://mccue.dev/pages/12-17-23-java-library-of-the-day-17

- Maven: [`dev.mccue/json`](https://central.sonatype.com/artifact/dev.mccue/json)
- Module Name: `dev.mccue.json`
- GitHub: [`bowbahdoe/json`](https://github.com/bowbahdoe/json)

```mermaid
graph TD
  classDef green stroke:#f00

  json[dev.mccue.json]:::green
```

## What is it

`dev.mccue.json` provides the ability to read and write JSON data as well as to encode data into JSON and decode data from JSON.

## Why use it

Most popular JSON libraries use data-binding. You make a class, possibly annotate it a little bit, and then some automatic logic binds the data inside a JSON structure to the fields of your class.

This has clear upsides, but it's hard to explain to newcomers. The underlying mechanism of data-binding is either reflection or compile-time code generation. Both are processes that are hard to "touch." The escape hatches in libraries that assume data-binding is the default mode of operation are less than ergonomic.

This takes the other approach, making people manually extract data from JSON, but does so in a way that is composable. You have to write more code, but the mechanics of the code are more plain to see.

I've [written about this before](https://mccue.dev/pages/2-26-23-json). I'm listing it here because some libraries I wrote that I'll introduce later depend on it.
## Getting Started ```java import dev.mccue.json.Json; import dev.mccue.json.JsonDecoder; import dev.mccue.json.JsonEncodable; record Superhero(String name) implements JsonEncodable { @Override public Json toJson() { return Json.objectBuilder() .put("name", name) .build(); } public static Superhero fromJson(Json json) { return new Superhero( JsonDecoder.field(json, "name", JsonDecoder::string) ); } } void main() { var superhero = new Superhero("superman"); var json = superhero.toJson(); var jsonStr = Json.writeString(json); var roundTripped = Json.readString(jsonStr); var newSuperhero = Superhero.fromJson(roundTripped); System.out.println(superhero); System.out.println(jsonStr); System.out.println(newSuperhero); } ```Sun, 17 Dec 0023 05:00:00 +0000com.samskivert.jmustachehttps://mccue.dev/pages/12-16-23-java-library-of-the-day-16 - Maven: [`com.samskivert/jmustache`](https://central.sonatype.com/artifact/com.samskivert/jmustache) - Module Name: `com.samskivert.jmustache` - GitHub: [`samskivert/jmustache`](https://github.com/samskivert/jmustache) ```mermaid graph TD classDef green stroke:#f00 jmustache[com.samskivert.jmustache]:::green ``` ## What is it `com.samskivert.jmustache` is an implementation of the [Mustache templating language](https://mustache.github.io/). ## Why use it Mustache is one of many templating languages used for generating HTML. It has the unique advantage of being especially portable between languages and environments. This comes as a result of a deliberate choice to not allow much "logic" in templates. To quote [its man page](https://mustache.github.io/mustache.5.html): > We call it "logic-less" because there are no if statements, else clauses, or for loops. Instead there are only tags. Some tags are replaced with a value, some nothing, and others a series of values. This comes at a cost - you need to arrange all the information for a template up-front - but does make it easier to consider the behavior of a template in isolation. 
Of the implementations of mustache available for the JVM, `com.samskivert.jmustache` has the fewest moving pieces, is [up-to-date with the specification](https://github.com/mustache/spec), and is [faster](https://github.com/agentgt/template-benchmark) than other implementations that do their work at runtime. ## Getting Started ```java import com.samskivert.mustache.Mustache; import java.util.List; import java.util.Map; record Cartoon(String name, boolean hasMovie) {} void main() { var cartoons = List.of( new Cartoon("Space Ghost Coast to Coast", false), new Cartoon("Harvey Birdman, Attorney at Law", false), new Cartoon("Sealab 2021", false), new Cartoon("The Venture Bros.", true) ); var template = """ <html> <body> <h1> Cartoons </h1> <ul> {{#cartoons}} <li> {{name}} {{#hasMovie}} (there is a movie) {{/hasMovie}} </li> {{/cartoons}} </ul> </body> </html> """; var compiledTemplate = Mustache.compiler() .compile(template); var renderedTemplate = compiledTemplate.execute( Map.of("cartoons", cartoons) ); // <html> // <body> // <h1> Cartoons </h1> // <ul> // <li> // Space Ghost Coast to Coast // </li> // <li> // Harvey Birdman, Attorney at Law // </li> // <li> // Sealab 2021 // </li> // <li> // The Venture Bros. // (there is a movie) // </li> // </ul> // </body> //</html> System.out.println(renderedTemplate); } ```Sat, 16 Dec 0023 05:00:00 +0000com.nulabinc.zxcvbnhttps://mccue.dev/pages/12-15-23-java-library-of-the-day-15 - Maven: [`com.nulab-inc/zxcvbn`](https://central.sonatype.com/artifact/com.nulab-inc/zxcvbn) - Module Name: `com.nulabinc.zxcvbn` - GitHub: [`nulab/zxcvbn4j`](https://github.com/nulab/zxcvbn4j) ```mermaid graph TD classDef green stroke:#f00 zxcvbn[com.nulabinc.zxcvbn]:::green ``` ## What is it `com.nulabinc.zxcvbn`, so named after one of the [100 most common passwords](https://en.wikipedia.org/wiki/Wikipedia:10,000_most_common_passwords), is a password strength estimator. ## Why use it People aren't very good at picking passwords. 
While it is technically their fault if they make their password `123456` and get their bank account stolen, that can very quickly become your problem. Some services try to mitigate this by asking that passwords have letters, numbers, and "special characters" in them. This doesn't stop things like `P@ssW1rd!`, which will be guessed by password crackers in under a millisecond. `com.nulabinc.zxcvbn` will instead try to figure out how easy it will be for a password cracker to guess the password. This will lead to your users having generally stronger passwords. ## Getting Started ```java import com.nulabinc.zxcvbn.WipeableString; import com.nulabinc.zxcvbn.Zxcvbn; void main() { var zxcvbn = new Zxcvbn(); // Pro-tip, storing passwords in mutable structures lets // you lower the time they are floating around in program // memory. This decreases the window of opportunity for // attackers that might have found a way to poke around // in your process. // // If that sort of attack isn't in your threat model, you // can use regular Strings. var password = new WipeableString("P@ssw0rd!"); var strength = zxcvbn.measure(password); var warning = strength.getFeedback() .getWarning(); // This is similar to a commonly used password. System.out.println(warning); var suggestions = strength.getFeedback() .getSuggestions(); // Add another word or two. Uncommon words are better. // Capitalization doesn't help very much. // Predictable substitutions like '@' instead of 'a' don't help very much. 
    System.out.println(String.join("\n", suggestions));

    // fair
    switch (strength.getScore()) {
        case 0 -> System.out.println("weak");
        case 1 -> System.out.println("fair");
        case 2 -> System.out.println("good");
        case 3 -> System.out.println("strong");
        default -> System.out.println("very strong");
    }
}
```
Fri, 15 Dec 0023 05:00:00 +0000

dev.mccue.microhttp.cookies
https://mccue.dev/pages/12-14-23-java-library-of-the-day-14

- Maven: [`dev.mccue/microhttp-cookies`](https://central.sonatype.com/artifact/dev.mccue/microhttp-cookies)
- Module Name: `dev.mccue.microhttp.cookies`
- GitHub: [`bowbahdoe/microhttp-cookies`](https://github.com/bowbahdoe/microhttp-cookies)

```mermaid
graph TD
  classDef green stroke:#f00

  microhttp[org.microhttp]
  microhttpcookies[dev.mccue.microhttp.cookies]:::green
  microhttp --> microhttpcookies
```

## What is it

`dev.mccue.microhttp.cookies` provides a utility for parsing the cookie headers sent in requests, specifically in microhttp's `Request` objects.

## Why use it

If you've asked a user to send you a cookie on subsequent requests, such as with `dev.mccue.microhttp.setcookie`, you will most likely want to interpret the data in that cookie when you get it.

This library provides the ability to do that.
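For a sense of what interpreting that data involves, here is a hand-rolled sketch that splits the value of a `Cookie` request header into name/value pairs. It is not how `dev.mccue.microhttp.cookies` is implemented. The class name `CookieHeaderSketch` and its `parse` method are made up for illustration, and the sketch deliberately ignores quoting and encoding rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class: a naive parser for the value of a Cookie header,
// e.g. "Counter=3; theme=dark".
public class CookieHeaderSketch {
    static Map<String, String> parse(String headerValue) {
        var cookies = new LinkedHashMap<String, String>();
        for (String pair : headerValue.split(";")) {
            int equals = pair.indexOf('=');
            if (equals < 0) {
                // Not a name=value pair, skip it.
                continue;
            }
            String name = pair.substring(0, equals).trim();
            String value = pair.substring(equals + 1).trim();
            if (!name.isEmpty()) {
                // Keep the first value seen for a repeated name.
                cookies.putIfAbsent(name, value);
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        // {Counter=3, theme=dark}
        System.out.println(parse("Counter=3; theme=dark"));
    }
}
```

Even this much has edge cases (repeated names, stray separators, values containing `=`), which is a reasonable argument for not rewriting it in every project.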
## Getting Started

```java
import dev.mccue.microhttp.cookies.Cookies;
import dev.mccue.microhttp.setcookie.SetCookieHeader;
import org.microhttp.EventLoop;
import org.microhttp.Options;
import org.microhttp.Response;

import java.util.List;

void main() throws Exception {
    var eventLoop = new EventLoop((request, consumer) -> {
        // Read the counter the browser sent back, defaulting to 0.
        var cookies = Cookies.parse(request);
        var counter = cookies.get("Counter")
                .orElse("0");

        // Ask the browser to store the incremented counter.
        var setCookieHeader = SetCookieHeader.of(
                "Counter",
                Integer.toString(Integer.parseInt(counter) + 1)
        );

        consumer.accept(
                new Response(
                        200,
                        "OK",
                        List.of(setCookieHeader),
                        counter.getBytes()
                )
        );
    });
    eventLoop.start();
    eventLoop.join();
}
```
Thu, 14 Dec 0023 05:00:00 +0000

dev.mccue.microhttp.setcookie
https://mccue.dev/pages/12-13-23-java-library-of-the-day-13

- Maven: [`dev.mccue/microhttp-setcookie`](https://central.sonatype.com/artifact/dev.mccue/microhttp-setcookie)
- Module Name: `dev.mccue.microhttp.setcookie`
- GitHub: [`bowbahdoe/microhttp-setcookie`](https://github.com/bowbahdoe/microhttp-setcookie)

```mermaid
graph TD
  classDef green stroke:#f00

  microhttp[org.microhttp]
  microhttpsetcookie[dev.mccue.microhttp.setcookie]:::green
  microhttp --> microhttpsetcookie
```

## What is it

`dev.mccue.microhttp.setcookie` provides a utility for generating a `Set-Cookie` header for use in a microhttp `Response`.

## Why use it

Whenever a web browser receives a response from a website, depending on user settings, it will look for any `Set-Cookie` headers in that response. Data conveyed in those headers will be sent back to the server with every subsequent request.

This is one of the easiest ways to have persistent state, like user sessions, on a website.
## Getting Started ```java import org.microhttp.Header; import dev.mccue.microhttp.setcookie.SameSite; import dev.mccue.microhttp.setcookie.SetCookieHeader; void main() { Header header = SetCookieHeader.of("name", "value"); // Header[name=Set-Cookie, value=name=value] System.out.println(header); Header otherHeader = SetCookieHeader.builder("name2", "value2") .sameSite(SameSite.STRICT) .secure(true) .build(); // Header[name=Set-Cookie, value=name2=value2; SameSite=Strict; Secure] System.out.println(otherHeader); } ```Wed, 13 Dec 0023 05:00:00 +0000com.sanctionco.jmailhttps://mccue.dev/pages/12-12-23-java-library-of-the-day-12 - Maven: [`com.sanctionco.jmail/jmail`](https://central.sonatype.com/artifact/com.sanctionco.jmail/jmail) - Module Name: `com.sanctionco.jmail` - GitHub: [`RohanNagar/jmail`](https://github.com/RohanNagar/jmail) ```mermaid graph TD classDef green stroke:#f00 javanaming[java.naming] jmail[com.sanctionco.jmail]:::green javanaming --> jmail ``` ## What is it `com.sanctionco.jmail` parses and validates email addresses. ## Why use it Web applications often need to work with email addresses in some form. If you find yourself needing to check if something is a valid email address, [`jmail` is more correct than the alternatives and generally around twice as fast.](https://www.rohannagar.com/jmail/). If you find yourself wanting to represent an email address in your domain, the `Email` type provided by this library will serve you well. You'd want to use that over a `String` for the same reason you'd want to store a path in a `Path` or an address in a `URI`. 
## Getting Started

```java
import com.sanctionco.jmail.Email;
import com.sanctionco.jmail.JMail;

void main() {
    // false
    System.out.println(JMail.isValid("gibberish"));
    // true
    System.out.println(JMail.isValid("apple@example.com"));

    Email email = Email.of("apple@example.com")
            .orElseThrow();

    record User(Email email) {
    }

    User user = new User(email);
    // User[email=apple@example.com]
    System.out.println(user);
    // example.com
    System.out.println(user.email().domain());
}
```
Tue, 12 Dec 0023 05:00:00 +0000

org.apiguardian.api
https://mccue.dev/pages/12-11-23-java-library-of-the-day-11

- Maven: [`org.apiguardian/apiguardian-api`](https://central.sonatype.com/artifact/org.apiguardian/apiguardian-api)
- Module Name: `org.apiguardian.api`
- GitHub: [`apiguardian-team/apiguardian`](https://github.com/apiguardian-team/apiguardian)

```mermaid
graph TD
  classDef green stroke:#f00

  apiguardian[org.apiguardian.api]:::green
```

## What is it

`org.apiguardian.api` provides an `@API` annotation. This gives a structured place to document API stability guarantees.

## Why use it

If you are writing an application, generally no API is truly stable. You are free to change whatever you need to in order to make the software work.

Libraries are different. Libraries are used by people whose code you have no control over, but with whom you form an implicit social contract where they trust you to not break their code with new library releases.

Explicitly documenting which elements of an API you are committed to maintaining, which you are experimenting with, and which they really shouldn't be touching is therefore a useful thing to do.

Annotations will show up prominently in generated documentation, which makes them a good mechanism for documenting these guarantees (or lack thereof).
## Getting Started

```java
import org.apiguardian.api.API;

public final class MathOps {
    // Not instantiable, so the operations below are static.
    private MathOps() {}

    @API(status = API.Status.STABLE)
    public static double pi() {
        return 3.14;
    }

    @API(status = API.Status.EXPERIMENTAL)
    public static double tau() {
        return pi() * 2;
    }
}
```
Mon, 11 Dec 0023 05:00:00 +0000

dev.mccue.microhttp-html
https://mccue.dev/pages/12-10-23-java-library-of-the-day-10

- Maven: [`dev.mccue/microhttp-html`](https://central.sonatype.com/artifact/dev.mccue/microhttp-html)
- Module Name: `dev.mccue.microhttp.html`
- GitHub: [`bowbahdoe/microhttp-html`](https://github.com/bowbahdoe/microhttp-html)

```mermaid
graph TD
  classDef green stroke:#f00

  microhttp[org.microhttp]
  microhttp-handler[dev.mccue.microhttp.handler]
  microhttp --> microhttp-handler
  reasonphrase[dev.mccue.reasonphrase]
  html[dev.mccue.html]
  microhttp-html[dev.mccue.microhttp.html]:::green
  microhttp --> microhttp-html
  microhttp-handler --> microhttp-html
  reasonphrase --> microhttp-html
  html --> microhttp-html
```

## What is it

`microhttp-html` provides `HtmlResponse`, a class which implements `IntoResponse` and thus can be used alongside `microhttp` and `microhttp-handler` to produce responses which contain html.

It automatically adds the appropriate `Content-Type` header, determines the HTTP reason phrase with `reasonphrase`, and accepts the `Html` type provided by `html`.

## Why use it

If you are using `microhttp` with `microhttp-handler`, it boxes up the logic needed in order to return html responses. This would otherwise be cumbersome to write at every needed location.

## Getting Started

At the time of writing, template processors are a preview feature, so you will need to use the latest version of the library and the latest JDK.
```java
import dev.mccue.microhttp.handler.RouteHandler;
import dev.mccue.microhttp.html.HtmlResponse;
import org.microhttp.Request;

import java.util.regex.Pattern;
import java.util.regex.Matcher;

import static dev.mccue.html.Html.HTML;

class IndexHandler extends RouteHandler {
    IndexHandler() {
        super("GET", Pattern.compile("/"));
    }

    @Override
    public HtmlResponse handleRoute(
            Matcher matcher,
            Request request
    ) {
        var name = "bob";
        // Embedded values such as \{name} are auto-escaped by the HTML processor.
        return new HtmlResponse(HTML."""
            <html>
              <body>
                <h1> Hello \{name} </h1>
              </body>
            </html>
            """);
    }
}
```
Sun, 10 Dec 0023 05:00:00 +0000

dev.mccue.html
https://mccue.dev/pages/12-9-23-java-library-of-the-day-9

- Maven: [`dev.mccue/html`](https://central.sonatype.com/artifact/dev.mccue/html)
- Module Name: `dev.mccue.html`
- GitHub: [`bowbahdoe/html`](https://github.com/bowbahdoe/html)

```mermaid
graph TD
  classDef green stroke:#f00

  html[dev.mccue.html]:::green
```

## What is it

`html` provides an `Html` type and a template processor which produces `Html` and auto-escapes any embedded values.

## Why use it

Before template processors, your options for producing html were to

* Keep the HTML in a template - usually, but not always, in a separate file
* Generate HTML with a programmatic API.

The first option is the most widespread, but means that your logic for filling in the template won't be co-located with the contents of the template itself. It also gives a damp and dimly-lit surface for "template languages" to grow. These are often full programming languages in their own right and require special IDE support.

The second option isn't as popular because you effectively lose the ability to apply the expertise of designers, who are generally familiar with HTML and are used to seeing page layout expressed in HTML. It also puts a lot of pressure on the programmatic API, since you need to make sure every HTML idiom you need to express is expressible. A difficult feat with an evolving standard.

Template processors are a middle ground of sorts.
They are a templating language but, because they will be an official part of Java, you can count on IDE support and will be able to co-locate them with the code for filling in values. They are a programmatic API but, because you write HTML directly, every idiom is expressible. It should be familiar to most designers as well.

You could reasonably draw parallels between this approach and `JSX` from the JavaScript world.

## Getting Started

At the time of writing, template processors are a preview feature, so you will need to use the latest version of the library and the latest JDK.

```java
import java.util.List;
import java.util.ArrayList;

import dev.mccue.html.Html;
import static dev.mccue.html.Html.HTML;

void main() {
    String name = "joe";
    var pets = List.of("snoopy", "Yellow Bird");

    // Build one <li> per pet, each as its own Html value.
    var petHtml = new ArrayList<Html>();
    for (var pet : pets) {
        petHtml.add(HTML."<li> \{pet} </li>");
    }

    var page = HTML."""
        <html>
          <body>
            <h1> Hello \{name} </h1>
            <ul>
              \{petHtml}
            </ul>
          </body>
        </html>
        """;
}
```
Sat, 09 Dec 0023 05:00:00 +0000

org.slf4j.simple
https://mccue.dev/pages/12-8-23-java-library-of-the-day-8

- Maven: [`org.slf4j/slf4j-simple`](https://central.sonatype.com/artifact/org.slf4j/slf4j-simple)
- Module Name: `org.slf4j.simple`
- GitHub: [`qos-ch/slf4j`](https://github.com/qos-ch/slf4j)

```mermaid
graph TD
  classDef green stroke:#f00

  slf4j[org.slf4j]
  slf4j-simple[org.slf4j.simple]
  slf4j --> slf4j-simple:::green
```

## What is it

`slf4j-simple` is a logging implementation for `slf4j-api`. It prints log messages emitted at a level of `INFO` or above to `System.err`.

## Why use it

If any dependency you have uses `slf4j-api` you will get errors at startup about not having a logging implementation.

`slf4j-simple` is not flexible or "powerful" by any definition but, depending on how you deploy your application, it might be all you need.

## Getting Started

You need to have both `slf4j-api` and `slf4j-simple` available to your program, then you should see output from logging statements.
```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

void main() {
    Logger logger = LoggerFactory.getLogger(getClass());
    logger.info("Hello World");
}
```
Fri, 08 Dec 0023 05:00:00 +0000

org.slf4j
https://mccue.dev/pages/12-7-23-java-library-of-the-day-7

- Maven: [`org.slf4j/slf4j-api`](https://central.sonatype.com/artifact/org.slf4j/slf4j-api)
- Module Name: `org.slf4j`
- GitHub: [`qos-ch/slf4j`](https://github.com/qos-ch/slf4j)

```mermaid
graph TD
  classDef green stroke:#f00

  slf4j[org.slf4j]:::green
```

## What is it

`slf4j-api` - "Simple Logging Facade for Java" - is a logging facade. For Java.

## Why use it

Logging facades let portions of a larger program emit text based logs without needing to know how those logs will be published.

This means external libraries will often emit logs via `slf4j` and expect the application they are included in to publish them with a logging implementation.

`slf4j-api` is the most ubiquitous of these and the winner of the 90s "logging wars."

## Getting Started

In order to have the code below emit any output, you need to make sure a logging implementation is included in your project. I'll introduce one of those tomorrow.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

void main() {
    Logger logger = LoggerFactory.getLogger(getClass());
    logger.info("Hello World");
}
```
Thu, 07 Dec 0023 05:00:00 +0000

dev.mccue.microhttp.handler
https://mccue.dev/pages/12-6-23-java-library-of-the-day-6

- Maven: [`dev.mccue/microhttp-handler`](https://central.sonatype.com/artifact/dev.mccue/microhttp-handler)
- Module Name: `dev.mccue.microhttp.handler`
- GitHub: [`bowbahdoe/microhttp-handler`](https://github.com/bowbahdoe/microhttp-handler)

```mermaid
graph TD
  classDef green stroke:#f00

  microhttp[org.microhttp]
  microhttp-handler[dev.mccue.microhttp.handler]:::green
  microhttp --> microhttp-handler
```

## What is it

`microhttp-handler` provides interfaces for making composable handlers for `microhttp`.
There are two interfaces in the module, `IntoResponse` and `Handler`. `IntoResponse` is something which can be converted into a `Response`. `Handler` is a function which takes a `Request` and returns something which implements `IntoResponse`.

There are also two implementations of `Handler` provided for convenience - `RouteHandler` and `DelegatingHandler`.

`RouteHandler` checks the request's method and uri and returns `null` if they don't match a chosen method and regex.

`DelegatingHandler` tries a list of handlers in order, returning the first non-null response or falling back to a default when no match is found. Combined with `RouteHandler`, this can act as a very basic request router.

## Why use it

A normal microhttp handler takes a `Request` and a `Consumer<Response>` that should be called later.

While this is fine, in the age of virtual threads there isn't much downside to modeling handlers as functions that take `Request`s and return `Response`s, and there are many upsides to doing so.

The programming model is simpler to compose, simpler to test, and it provides an opportunity to introduce concepts such as middleware and `IntoResponse`.

`IntoResponse` is useful because making the normal `Response` record requires dealing with reason phrases, content-type headers, and body encoding. `IntoResponse` provides a seam for custom types to box up much of that logic.
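To make the "simpler to compose" point a bit more concrete, here is a minimal sketch of what middleware can look like once handlers are plain functions. Everything in it (`SketchRequest`, `SketchResponse`, `SketchHandler`, `MiddlewareSketch`) is a hypothetical stand-in written for this illustration, not a type from `microhttp` or `microhttp-handler`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the microhttp types.
record SketchRequest(String method, String uri) {}

record SketchResponse(int status, String body) {}

// A handler is just a function from a request to a response.
@FunctionalInterface
interface SketchHandler {
    SketchResponse handle(SketchRequest request);
}

final class MiddlewareSketch {
    private MiddlewareSketch() {}

    /** Wraps a handler so every request is recorded before it is handled. */
    static SketchHandler logging(SketchHandler next, List<String> log) {
        return request -> {
            log.add(request.method() + " " + request.uri());
            return next.handle(request);
        };
    }

    /** Wraps a handler so an unchecked exception becomes a 500 response. */
    static SketchHandler catching(SketchHandler next) {
        return request -> {
            try {
                return next.handle(request);
            } catch (RuntimeException e) {
                return new SketchResponse(500, "Internal Error");
            }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SketchHandler hello = request -> new SketchResponse(200, "Hello, world");

        // Middleware is ordinary function composition.
        SketchHandler app = catching(logging(hello, log));

        System.out.println(app.handle(new SketchRequest("GET", "/")).status());
        System.out.println(log);
    }
}
```

Because each piece is a function, testing one means calling it with a request and checking what comes back. No server or callback plumbing is needed.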
## Getting Started

```java
import org.microhttp.EventLoop;
import org.microhttp.Response;
import org.microhttp.Header;
import dev.mccue.microhttp.handler.Handler;
// Assumed to live in the same package as Handler
import dev.mccue.microhttp.handler.IntoResponse;
import dev.mccue.microhttp.handler.RouteHandler;
import dev.mccue.microhttp.handler.DelegatingHandler;
import dev.mccue.reasonphrase.ReasonPhrase;

import java.util.List;
import java.util.regex.Pattern;

record TextResponse(int status, String value) implements IntoResponse {
    @Override
    public Response intoResponse() {
        return new Response(
            status,
            ReasonPhrase.forStatus(status),
            List.of(new Header("Content-Type", "text/plain")),
            value.getBytes()
        );
    }
}

void main() throws Exception {
    Handler index = RouteHandler.of(
        "GET",
        Pattern.compile("/"),
        request -> new TextResponse(200, "Hello, world")
    );

    Handler rootHandler = new DelegatingHandler(
        List.of(index),
        new TextResponse(404, "Not Found")
    );

    var error = new TextResponse(500, "Internal Error");

    var eventLoop = new EventLoop((request, callback) -> {
        Thread.startVirtualThread(() -> {
            try {
                callback.accept(
                    rootHandler.handle(request)
                        .intoResponse()
                );
            } catch (Exception e) {
                callback.accept(error.intoResponse());
            }
        });
    });

    eventLoop.start();
    eventLoop.join();
}
```

Wed, 06 Dec 2023 05:00:00 +0000

org.pcollections

https://mccue.dev/pages/12-5-23-java-library-of-the-day-5

- Maven: [`org.pcollections/pcollections`](https://central.sonatype.com/artifact/org.pcollections/pcollections)
- Module Name: `org.pcollections`
- GitHub: [`hrldcpr/pcollections`](https://github.com/hrldcpr/pcollections)

```mermaid
graph TD
    classDef green stroke:#f00
    pcollections[org.pcollections]:::green
```

## What is it

`pcollections` provides [Persistent Immutable Collections](https://en.wikipedia.org/wiki/Persistent_data_structure).

These are collections which cannot be modified, but use structural sharing to make creating updated versions of themselves efficient.
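If "structural sharing" sounds abstract, the sketch below shows the idea with the simplest persistent structure there is, an immutable linked list. `SharedList` is a hypothetical class written for this illustration. It is not part of `pcollections`, whose own collections are more sophisticated, and it adds to the front rather than the end.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical persistent list for illustration; not a pcollections type. */
final class SharedList<T> {
    private final T head;             // newest element (unused in the empty list)
    private final SharedList<T> tail; // the previous version, shared and never copied
    private final int size;

    private SharedList(T head, SharedList<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> SharedList<T> empty() {
        return new SharedList<>(null, null, 0);
    }

    /** Returns a new version with value at the front; this version is untouched. */
    SharedList<T> prepend(T value) {
        return new SharedList<>(value, this, size + 1);
    }

    SharedList<T> tail() {
        return tail;
    }

    int size() {
        return size;
    }

    /** Copies the elements out, newest first. */
    List<T> toList() {
        List<T> out = new ArrayList<>(size);
        for (SharedList<T> node = this; node.size > 0; node = node.tail) {
            out.add(node.head);
        }
        return out;
    }

    public static void main(String[] args) {
        SharedList<String> one = SharedList.<String>empty().prepend("a");
        SharedList<String> two = one.prepend("b");

        System.out.println(one.toList()); // [a]
        System.out.println(two.toList()); // [b, a]
        // The new version points at the old one instead of copying it.
        System.out.println(two.tail() == one); // true
    }
}
```

Creating `two` allocates a single node no matter how long `one` is, and `one` stays valid afterwards. That is the property the persistent collections here rely on.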
## Why use it

Persistent collections are useful when you want a data aggregate that is immutable but will require multiple updates over the runtime of the program.

`pcollections` is unique in the ecosystem of persistent collection libraries in that its types directly extend the [Java Collections Framework](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/doc-files/coll-index.html). Its `PVector` is a subtype of `java.util.List`, its `PMap` is a subtype of `java.util.Map`, and so on.

There are definite cons to that - having a `remove` method that cannot actually remove anything is not ideal - but there are also pros. There is no conversion cost when interacting with the numerous APIs that expect `java.util.*` types.

## Getting Started

### Basic Usage

```java
import org.pcollections.PVector;
import org.pcollections.TreePVector;

void main() {
    PVector<String> names = TreePVector.empty();
    var names2 = names.plus("Mumenstallu");
    var names3 = names2.plus("Snufkin");

    System.out.println(names);
    System.out.println(names2);
    System.out.println(names3);
}
```

### Many updated versions

```java
import org.pcollections.PVector;
import org.pcollections.TreePVector;

void main() {
    PVector<PVector<Integer>> allVersions = TreePVector.empty();
    PVector<Integer> numbers = TreePVector.empty();
    for (int i = 0; i < 10000; i++) {
        numbers = numbers.plus(i);
        allVersions = allVersions.plus(numbers);
    }

    System.out.println(allVersions.size());

    // Every version is still valid
    System.out.println(allVersions.get(2));
    System.out.println(allVersions.get(4));
    System.out.println(allVersions.get(100));

    int total = 0;
    for (int n : numbers) {
        total += n;
    }
    System.out.println(total);
}
```

Tue, 05 Dec 2023 05:00:00 +0000

dev.mccue.urlparameters

https://mccue.dev/pages/12-4-23-java-library-of-the-day-4

- Maven: [`dev.mccue/urlparameters`](https://central.sonatype.com/artifact/dev.mccue/urlparameters)
- Module Name: `dev.mccue.urlparameters`
- GitHub: [`bowbahdoe/urlparameters`](https://github.com/bowbahdoe/urlparameters)
```mermaid graph TD classDef green stroke:#f00 urlencoder[com.uwyn.urlencoder] urlparameters[dev.mccue.urlparameters]:::green urlencoder --> urlparameters ``` ## What is it `urlparameters` provides the logic needed to read and write "URL parameters." This covers both query parameters, often seen at the end of a URL (`google.com?q=apples&track_id=123`), and the bodies of html form submissions (`name=bob&age=98`). It uses `com.uwyn.urlencoder` to properly encode parameters for both situations. ## Why use it Most websites eventually encode some information as query parameters inside a URL. `urlparameters` lets you extract that information as well as produce such URLs. It is also common for websites to accept information from a user via a form submission - more so if you server-side render HTML. Processing those form submissions means parsing those request bodies. ## Getting Started ### Parse Query Params from a URL ```java import java.net.URI; import dev.mccue.urlparameters.UrlParameters; void main() { var url = URI.create("https://google.com?q=pear"); var params = UrlParameters.parse(url); System.out.println(params.firstValue("q").orElseThrow()); } ``` ([Playground Link](https://java-playground.com/?runtime=latest&release=21&preview=enabled&gist=2271404ce30f6b9f029637a3f17667bd)) ### Parse Form Submission bodies ```java import dev.mccue.urlparameters.UrlParameters; void main() { var body = "name=jack&title=squire"; var params = UrlParameters.parse(body); // squire System.out.println(params.firstValue("title").orElseThrow()); } ``` ### Generate a URL with Query Params ```java import java.util.List; import java.net.URI; import dev.mccue.urlparameters.UrlParameters; import dev.mccue.urlparameters.UrlParameter; void main() { var params = new UrlParameters(List.of( new UrlParameter("pokemon", "stantler"), new UrlParameter("caught_in", "Pokemon Colosseum") )); var url = URI.create("https://example.com?" 
+ params); // https://example.com?pokemon=stantler&caught_in=Pokemon%20Colosseum System.out.println(url); } ```Mon, 04 Dec 0023 05:00:00 +0000com.uwyn.urlencoderhttps://mccue.dev/pages/12-3-23-java-library-of-the-day-3 - Maven: [`com.uwyn/urlencoder`](https://central.sonatype.com/artifact/com.uwyn/urlencoder) - Module Name: `com.uwyn.urlencoder` - GitHub: [`gbevin/urlencoder`](https://github.com/gbevin/urlencoder) ```mermaid graph TD classDef green stroke:#f00 urlencoder[com.uwyn.urlencoder]:::green ``` ## What is it `urlencoder` encodes URL components using rules determined by combining the unreserved character set from [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986#page-13) with the percent-encode set from [application/x-www-form-urlencoded](https://url.spec.whatwg.org/#application-x-www-form-urlencoded-percent-encode-set). ## Why use it The built-in [`java.net.URLEncoder`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/net/URLEncoder.html) encodes strings into the "HTML form encoding." This is very slightly different from the form of encoding that should be used for URLs. In the specifications for URIs ([URLs are a subset of URIs](https://www.rfc-editor.org/rfc/rfc3986#section-1.1.3)), spaces are encoded as `%20`. In `application/x-www-form-urlencoded`, spaces are usually encoded as `+`, though `%20` would also be valid. Because [`java.net.URLEncoder`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/net/URLEncoder.html) uses `+` for spaces, [other libraries can fail to decode data properly](https://github.com/gbevin/urlencoder#why-not-simply-use-javaneturlencoder). 
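The difference is easy to see using only the JDK. The sketch below is an illustration and does not use `urlencoder` itself. `SpaceEncodingDemo` and its method names are made up for this example, and the `replace` call is a common workaround that only addresses spaces, not a full RFC 3986 encoder.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo class; only JDK APIs are used. */
final class SpaceEncodingDemo {
    private SpaceEncodingDemo() {}

    /** HTML form encoding: spaces become '+'. */
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /**
     * Workaround: a literal '+' in the input is already escaped as %2B,
     * so every '+' left in the output stands for a space.
     */
    static String percentEncodeSpaces(String value) {
        return formEncode(value).replace("+", "%20");
    }

    /** What a URI-level decoder sees: it undoes %XX escapes but leaves '+' alone. */
    static String queryAsSeenByUri(String encodedValue) {
        return URI.create("https://example.com/?q=" + encodedValue).getQuery();
    }

    public static void main(String[] args) {
        String form = formEncode("Hello world");
        String percent = percentEncodeSpaces("Hello world");

        System.out.println(form);                      // Hello+world
        System.out.println(percent);                   // Hello%20world
        System.out.println(queryAsSeenByUri(form));    // q=Hello+world
        System.out.println(queryAsSeenByUri(percent)); // q=Hello world
    }
}
```

The third line of output is the failure mode in question: code that decodes by URI rules hands back a `+` where the user typed a space.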
The [`UrlEncoder`](https://github.com/gbevin/urlencoder/blob/main/src/main/java/com/uwyn/urlencoder/UrlEncoder.java) class in this library uses `%20` for spaces and is also [reportedly more efficient](https://github.com/gbevin/urlencoder/tree/main#url-encoder-for-java) than [`java.net.URLEncoder`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/net/URLEncoder.html). ## Getting Started ```java import com.uwyn.urlencoder.UrlEncoder; void main() { var string = "Hello world"; var encoded = UrlEncoder.encode(string); System.out.println(encoded); // Hello%20world var decoded = UrlEncoder.decode(encoded); System.out.println(decoded); // Hello world } ```Sun, 03 Dec 0023 05:00:00 +0000dev.mccue.reasonphrasehttps://mccue.dev/pages/12-2-23-java-library-of-the-day-2 - Maven: [`dev.mccue/reasonphrase`](https://central.sonatype.com/artifact/dev.mccue/reasonphrase) - Module Name: `dev.mccue.reasonphrase` - GitHub: [`bowbahdoe/reasonphrase`](https://github.com/bowbahdoe/reasonphrase) ```mermaid graph TD classDef green stroke:#f00 reasonphrase[dev.mccue.reasonphrase]:::green ``` ## What is it `reasonphrase` is a library that provides a lookup from an [HTTP Status Code](https://www.rfc-editor.org/rfc/rfc9110.html#name-overview-of-status-codes) (like `200`) to an [HTTP Reason Phrase](https://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html) (like `OK`). Reason phrases are a mostly unused part of the HTTP protocol but, if you need to pick one anyway, you might as well pick a standard one. ## Why use it In most situations, your web server will automatically pick a reason phrase based on the status code of a response. A notable exception to this is Microhttp, which exposes the reason phrase directly in its `Response` record. So if you are using Microhttp, or some other minimal server, then this library will be of use. 
## Getting Started

```java
import java.util.List;

import org.microhttp.EventLoop;
import org.microhttp.Response;
import org.microhttp.Header;
import dev.mccue.reasonphrase.ReasonPhrase;

void main() throws Exception {
    var eventLoop = new EventLoop((request, callback) -> {
        callback.accept(new Response(
            200,
            ReasonPhrase.forStatus(200),
            List.of(new Header("Content-Type", "text/plain")),
            "Hello, world".getBytes()
        ));
    });
    eventLoop.start();
    eventLoop.join();
}
```

Sat, 02 Dec 2023 05:00:00 +0000

org.microhttp

https://mccue.dev/pages/12-1-23-java-library-of-the-day-1

- Maven: [`org.microhttp/microhttp`](https://central.sonatype.com/artifact/org.microhttp/microhttp)
- Module Name: `org.microhttp`
- GitHub: [`ebarlas/microhttp`](https://github.com/ebarlas/microhttp)

```mermaid
graph TD
    classDef green stroke:#f00
    A[org.microhttp]:::green
```

## What is it

Microhttp is an implementation of an HTTP/1.1 server.

This means that it can handle things like `GET` and `POST` requests, but not securing a connection with SSL or websocket connections.

## Why use it

It is [very fast](https://www.reddit.com/r/java/comments/w7uqcl/latest_version_of_microhttp_an_eventdriven/) and, as a result of eschewing support for most convenience features and other protocols, it has a codebase that can be [reasonably read and understood fully](https://github.com/ebarlas/microhttp/blob/main/src/main/java/org/microhttp/ConnectionEventLoop.java) in a day or two at most.

It discretizes requests and responses, which is a problem if you were expecting to handle file uploads or other such tasks directly, but a non-issue if you only intend to send and receive payloads of reasonable size.

In order to publish to the wider internet, you will need to have SSL. That Microhttp doesn't handle this natively isn't that much of an issue since most platforms as a service like [Heroku](https://www.heroku.com/), [Railway](https://railway.app/), and [Render](https://render.com/) provide this by default.
As will any load balancer or properly configured [`nginx`](https://www.nginx.com/blog/using-free-ssltls-certificates-from-lets-encrypt-with-nginx/).

## Getting Started

```java
import java.util.List;

import org.microhttp.EventLoop;
import org.microhttp.Response;
import org.microhttp.Header;

void main() throws Exception {
    var eventLoop = new EventLoop((request, callback) -> {
        callback.accept(new Response(
            200,
            "OK",
            List.of(new Header("Content-Type", "text/plain")),
            "Hello, world".getBytes()
        ));
    });
    eventLoop.start();
    eventLoop.join();
}
```

Fri, 01 Dec 2023 05:00:00 +0000

Better Java Compiler Error Messages

https://mccue.dev/pages/8-13-23-java-compiler-error-messages

This post represents almost a year of work from [Andrew Arnold](https://github.com/aarnold314), [Ataberk Cirikci](https://github.com/AtaberkCirikci), [Noah Jamison](https://github.com/NJamison1), and [Thalia La Pommeray](https://github.com/ThaliaLaPommeray)<sup><a href="#1">1</a></sup>.

Every part of this that you agree with, they deserve all the credit for. Every part that you do not can be blamed on me.

Also, this is about Java. We'll take a bit of a winding road to get there, but we will get there.

## Background

A compiler's job is to take code - usually from text files - and produce some usable artifact from it. For Java that means taking `*.java` files and producing `*.class` files that can be fed into a JVM.<sup><a href="#2">2</a></sup>

The first and most important priority of a compiler is to be correct. If the class files produced by the Java compiler do not function in the way specified by the [Java Language Specification](https://docs.oracle.com/javase/specs/) then it would not be a Java compiler.

<!-- [It would be a compiler for a different language](https://www.reddit.com/r/java/comments/13mp9g0/comment/jl9yu0n/?utm_source=share&utm_medium=web2x&context=3). -->

The second priority of a compiler has historically been to be fast and resource efficient. In the 90s, CPU and RAM were far more scarce resources.
If a language couldn't be compiled efficiently then it would be impractical to use. <sup><a href="#3">3</a></sup> What has historically not been a focus, and has seen a renaissance in modern times, is error messages. ### Elm Elm is a very small and very focused language. It is built specifically for making frontend web apps and has a very restricted set of features.<sup><a href="#4">4</a></sup> ```elm import Html main = Html.h1 "Hello, world" ``` It's pretty cool. [Check it out](https://elm-lang.org/try) if you have some time. Due to its relatively small surface area<sup><a href="#5">5</a></sup>, the language designer was [able to dedicate time to the user experience (UX) of its compiler errors](https://elm-lang.org/news/compiler-errors-for-humans). And by [most](https://twitter.com/TartanLlama/status/978285498430119937) [accounts](https://twitter.com/Mastapegs/status/1492173070529871884)<sup><a href="#6">6</a></sup>, that work has paid off. When a programmer makes a mistake with their Elm code, the Elm compiler will - always have a friendly tone - do its best to show the relevant areas of the code - provide a hint as to how to resolve the error Say we took the example from above and tried to use a fictional `h7` tag. ```elm import Html -- There is no h7 tag main = Html.h7 "Hello, world" ``` The error you would get is the following. ``` I cannot find a `Html.h7` variable: 5| Html.h7 "Hello, world" ^^^^^^^ The `Html` module does not expose a `h7` variable. These names seem close though: Html.h1 Html.h2 Html.h3 Html.h4 Hint: Read <https://elm-lang.org/0.19.1/imports> to see how `import` declarations work in Elm. ``` First to note is the personification of the compiler as an entity. When it says "*I* cannot find a variable" it subtly, but importantly, primes the user to think about the compiler as an entity unto itself. Its small stuff like that gives our monkey brains the hooks it needs to anthropomorphize. 
And that is useful, because the only time a compiler talks to you is when something is wrong. Which would you prefer - "You have cancer" - "I have your test results, and unfortunately you have cancer." The first one is a game over screen in a [FromSoft game](https://www.youtube.com/watch?v=oUKyFPMZ2_g), the second is a human being with some bedside manner.<sup><a href="#7">7</a></sup> Another aspect about this that is cool is that it points to the exact place in the code that is at issue. Not just the line, but specifically the `Html.h7` expression. ```text 5| Html.h7 "Hello, world" ^^^^^^^ ``` > Before you can resolve an error, you need to find the code causing it. Seems pretty obvious. > > With many compilers you get a location like program.x:43:22 that you have to decipher. Where is that file? Which one is the line? Which is the column? Okay, let me scan through my code. You also often get a pretty-printed version of the problematic code, but it looks nothing like the code you wrote. You again need to do a mental transformation to find it. So a lot of time is lost: > > * converting row and column numbers into an actual file position > * converting pretty-printed code onto actual code to verify that position And don't forget that hint! It's a pretty basic analysis<sup><a href="#8">8</a></sup>, but being able to suggest functions that the user might have meant is big. Even if we disregard the exact contents of the hint, that there is a dedicated place to give hints and for users to look for hints is great. It is hard to show in this format<sup><a href="#9">9</a></sup>, but in addition to the layout of the message things like the `^^^^^^^` are colored red to draw our attention. <code> <pre> 5| Html.h7 "Hello, world" <span style="color: red">^^^^^^^</span> </pre> </code> Say we constructed a similar situation in Java. 
```java class Html { static Object h1() { return null; } static Object h2() { return null; } static Object h3() { return null; } static Object h4() { return null; } static Object h5() { return null; } static Object h6() { return null; } } public class Main { public static void main(String[] args) { System.out.println(Html.h7()); } } ``` The error message we get is [streets behind](https://www.youtube.com/watch?v=gCktKQKXNWg). ``` /Main.java:24: error: cannot find symbol System.out.println(Html.h7()); ^ symbol: method h7() location: class Html 1 error ``` > It is kind of shocking how much better things get when you focus on the user. I mean, on some level, it is not shocking at all though. Most terminal tools came into existence well before our industry really started focusing on making apps and websites feel great for their users. We all collectively realized that a hard to use app or website is bad for business, but the same lessons have not really percolated down to tools like compilers and build tools yet. Hopefully I have demonstrated that we can do better! ### Rust Rust is a systems programming language. That means that it targets the same use-cases as C and C++ where speed and predictable latency are hard requirements. ```rust struct Position { x: u32, y: u32 } fn main() { let position = Position { x: 0, y: 1 }; println!("x: {}, y: {}", position.x, position.y); } ``` Rust's most famous feature is its borrow checker. 
This is what lets it compete in ergonomics with languages like Python and Java without automatic garbage collection at runtime.<sup><a href="#10">10</a></sup> It does this by tracking the "lifetime" of individual variables and fields, putting some rules in place for when those lifetimes end, and what to do when they end and the variable is "dropped".<sup><a href="#11">11</a></sup> ```rust struct Position { x: u32, y: u32 } fn main() { // Lifetime of the position starts here let position = Position { x: 0, y: 1 }; println!("x: {}, y: {}", position.x, position.y); // At this point the position variable is no longer "alive" // and all the memory allocated for it will be freed. } ``` The tradeoff here is that the complexity of tracking lifetimes is pushed into the type system. This [and other advanced features](https://practice.rs/generics-traits/const-generics.html) make Rust one of the most complicated languages out there. ["Fighting the Borrow Checker"](https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/book/first-edition/references-and-borrowing.html) is a very common occurrence. Despite this, [it is overwhelmingly loved](https://survey.stackoverflow.co/2022#technology-most-loved-dreaded-and-wanted) by those who have used it. My hypothesis for why this is the case<sup><a href="#12">12</a></sup> is it is because very early on its development, there [was dedicated focus given to the error messages its compiler produced](https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html).<sup><a href="#13">13</a></sup> Even though people tend to produce malformed programs far more often<sup><a href="#14">14</a></sup>, the experience of the compiler being "helpful" offset that.<sup><a href="#15">15</a></sup> > With the importance of addressing Rust's learning curve a key theme in the Rust survey we're as motivated as ever to find any confusing or distracting part of the Rust experience and give it a healthy amount of polish. 
> Errors are one area where we're applying that polish helps us improve the learning curve bit by bit, and we're looking forward to seeing how far we can go. All error messages in Rust have a specific structure. There is a place for saying where an error occurred, why it occurred, what the error was, and potentially a hint as to how to resolve it. For example, this code is malformed because enum variants need to be prefixed with the name of the enum. ```rust enum Ex { A, B } pub fn main() { let ex = A; } ``` The error message that the compiler produces reflects that. ``` error[E0425]: cannot find value `A` in this scope --> src/main.rs:7:14 | 7 | let ex = A; | ^ not found in this scope | help: consider importing this unit variant | 1 | use crate::Ex::A; | For more information about this error, try `rustc --explain E0425`. ``` It says that the problem is that the value `A` could not be found in scope, shows exactly where in the code the problem is, and offers a hint as to how to resolve it. Just like the Elm errors there is a dedicated section for giving hints, the exact place in the code where a problem happens is shown, and the message is written in a friendly tone. ``` error[E0425]: WHAT --> WHERE | 7 | let ex = A; | ^ WHY + (arrow gives implicit WHERE) | help: HINT (can have many) | 1 | HINT | ``` Compare and contrast that to the error you get with similarly malformed Java code. ```java import java.util.List; enum Ex { A, B } public class MyClass { public static void main(String args[]) { var ex = B; } } ``` ``` /MyClass.java:10: error: cannot find symbol var ex = B; ^ symbol: variable B location: class MyClass ``` It still shows where the problem happened, but it offers no assistance for fixing it and uses a fittingly robotic tone.<sup><a href="#16">16</a></sup> ## Scala Scala is another language for the JVM like Java. 
I won't go that deep into an explanation because I am not qualified to do so, but as part of the work on Scala 3 they [worked to improve their error messages in ways similar to Elm and Rust](https://www.scala-lang.org/blog/2016/10/14/dotty-errors.html).

> We’ve looked at how other modern languages like Elm and Rust handle compiler warnings and error messages, and come to realize that Dotty is actually in great shape to provide comprehensive and easy to understand error messages in the same spirit.

[That work is still ongoing](https://github.com/lampepfl/dotty/issues/1589#issuecomment-593879892), but the focus was there. That doesn't mean anything by itself, but I choose to take it as social proof that I'm not crazy.

## IDEs

It is tempting to say that it doesn't really matter what errors the compiler spits out because IDEs are in a better position to give feedback anyway.

To an extent, this makes sense. IDEs like [IntelliJ](https://www.jetbrains.com/idea/download/?section=mac) are able to provide feedback in ways that a compiler cannot.

* If something is wrong, they can highlight it in red.
* If something is questionable, they can highlight it in yellow.
* If the IDE has a suggestion on how to fix something, it can show that through other visual cues.<sup><a href="#17">17</a></sup>

That's all great, but unfortunately I don't think it is enough.

It's easy to forget when my M1 Mac is running [Baldur's Gate 3](https://baldursgate3.game/) at 60 fps, but hardware powerful enough to run an IDE smoothly is a privilege.
[IntelliJ](https://www.jetbrains.com/idea/download/?section=mac) cannot run on a chromebook or whatever commodity hardware a [chronically underfunded school system](https://www.cnn.com/2022/09/18/us/school-conditions-2022/index.html#:~:text=The%20school%20funding%20gap%20keeps%20getting%20worse&text=And%20the%20gap%20is%20widening,had%20grown%20by%20%2425%20billion) can afford.<sup><a href="#18">18</a></sup> This is partly why many curriculums use online platforms like [repl.it](https://repl.it) that are hosted remotely or rely on a student's workflow to be through the command line and a basic text editor. In those cases especially, compiler errors are front and center in a student's learning. This is likely going to become more true when the JEPs for [Unnamed Classes, Instance Main Methods](https://openjdk.org/jeps/445), and [Multi-File Source-Code Programs](https://openjdk.org/jeps/8304400) are integrated and the ergonomics of teaching from the command line become more in line with that of other languages. And while I can't demonstrate it in as strong a way, I maintain that the error messages matter when you are using an IDE as well. Not everyone sees the red squiggles, some students actively ignore them, and they will see the original compiler message when they try to run their code regardless. If what they see is vague and unhelpful, that matters.<sup><a href="#19">19</a></sup> ## Research There was, to my knowledge, exactly one overview study done on compiler error messages. ["Compiler Error Messages Considered Unhelpful: The Landscape of Text-Based Programming Error Message Research"](https://www.researchgate.net/publication/339039595_Compiler_Error_Messages_Considered_Unhelpful_The_Landscape_of_Text-Based_Programming_Error_Message_Research). To those unfamiliar, an overview study is research that looks at existing research within a field and draws conclusions from the body of work in totality. 
You don't get to make claims like "studies consistently show X, Y, and Z" without looking at all the studies.

There are a few things from that study I think are worthy of note.

First is that there is very little actual research on error messages.

> One of our most striking observations was that there was relatively little literature on the effect of programming error messages on students and their learning.

Which at the very least makes me feel better about "going with my gut." Everyone has to be.

Second is that the research that _does_ exist does not produce any strong conclusions.

> While there have been many guidelines for programming error message design proposed and implemented, no coherent picture has emerged to answer even simple questions such as: What does a good programming error message look like?

But in the summation of the literature there are a few general guidelines that emerged.

* Increase Readability
* Reduce Cognitive Load
* Provide Context
* Use a Positive Tone
* Show Examples
* Show Solutions or Hints
* Allow Dynamic Interaction
* Provide Scaffolding
* Use logical argumentation
* Report errors at the right time

So while I would *love* to say "I'm right and science agrees with me"<sup><a href="#21">21</a></sup>, the best I can say is that all the properties of Elm and Rust compiler messages that I have noted are at least represented in the list of things that research suggests "might help." That is

* Provide Context
* Use a Positive Tone
* Show Solutions or Hints

There are individual studies [like this one](https://www.brettbecker.com/wp-content/uploads/2016/06/Becker-Effective-2016-SIGCSE.pdf) that more directly support my claims, but I've heard of enough horrible [Dr. Oz](https://www.iheart.com/podcast/105-behind-the-bastards-29236323/episode/part-one-dr-oz-why-americas-81426004/) segments like "Chocolate - the new superfood?!" to know that it's disingenuous to use single studies like that.
So yeah, best I can say is "I am not obviously wrong." ## The Structure of `javac` The reference compiler for Java is `javac`. It comes with every [OpenJDK build,](https://www.youtube.com/watch?v=3bfR22iv8Pc) and it's what most people use to compile their java.<sup><a href="#25">25</a></sup> I will briefly explain some of its internal workings so that you have some context on what we changed and why. ### compiler.properties All the error message text for `javac` lives inside a set of [`compiler.properties`](https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/resources/compiler.properties) files. There is one file for each language that has translations. German text is in `compiler_de.properties`, Japanese in `compiler_ja.properties`, and so on. ```properties compiler.err.abstract.meth.cant.have.body=\ abstract methods cannot have a body ``` Each message is keyed in a way that indicates its purpose. `compiler.err.*` are for error messages, `compiler.warn.*` for warnings, and `compiler.misc.*` for [potpourri](https://www.youtube.com/watch?v=PaFSkWfFhO0). There are comments above messages with placeholders to indicate the type of data that needs to be filled in. ```properties # 0: name compiler.err.call.must.be.first.stmt.in.ctor=\ call to {0} must be first statement in constructor # 0: symbol kind, 1: name, 2: symbol kind, 3: type, 4: message segment compiler.err.cant.apply.symbol.noargs=\ {0} {1} in {2} {3} cannot be applied to given types;\n\ reason: {4} ``` These files get processed into Java classes by [this tooling](https://github.com/openjdk/jdk/blob/master/make/langtools/tools/propertiesparser/gen/ClassGenerator.java). 
The classes that get generated are subclasses of [`JCDiagnostic.DiagnosticInfo`](https://github.com/openjdk/jdk/blob/ec0cc6300a02dd92b25d9072b8b3859dab583bbd/src/jdk.compiler/share/classes/com/sun/tools/javac/util/JCDiagnostic.java#L52), where `compiler.err.*` properties are turned into instances of [`JCDiagnostic.Error`](https://github.com/openjdk/jdk/blob/ec0cc6300a02dd92b25d9072b8b3859dab583bbd/src/jdk.compiler/share/classes/com/sun/tools/javac/util/JCDiagnostic.java#L576), `compiler.warn.*` into [`JCDiagnostic.Warning`](https://github.com/openjdk/jdk/blob/ec0cc6300a02dd92b25d9072b8b3859dab583bbd/src/jdk.compiler/share/classes/com/sun/tools/javac/util/JCDiagnostic.java#L585C31-L585C38), and so on. The class [`CompilerProperties`](https://gist.github.com/bowbahdoe/3f6270a96260daa841329d7ef4a998d7) holds constants for each of these messages as well as static methods for the messages that had those special comments indicating that they need placeholders. ```java /** * compiler.err.anonymous.diamond.method.does.not.override.superclass=\ * method does not override or implement a method from a supertype\n\ * {0} */ public static Error AnonymousDiamondMethodDoesNotOverrideSuperclass( Fragment arg0 ) { return new Error( "compiler", "anonymous.diamond.method.does.not.override.superclass", arg0 ); } /** * compiler.err.array.and.receiver =\ * legacy array notation not allowed on receiver parameter */ public static final Error ArrayAndReceiver = new Error("compiler", "array.and.receiver"); // And so on ``` ### JCDiagnostic I've been mostly talking about "error messages," but they are ontologically just one kind of "diagnostic". `JCDiagnostic` - short for java compiler diagnostic I'm pretty sure - is the representation `javac` has for diagnostic messages. It stores references to all the information needed to construct a message shown to the user. 
This includes the `source` which the diagnostic references, the `position` in that source being referenced, as well as [other miscellaneous metadata](https://github.com/openjdk/jdk/blob/ec0cc6300a02dd92b25d9072b8b3859dab583bbd/src/jdk.compiler/share/classes/com/sun/tools/javac/util/JCDiagnostic.java#L462). A pointer to some text from the `compiler.properties` as well as the arguments needed for any placeholders in said text are stored in a sub-object under `diagnosticInfo`. These `DiagnosticInfo` objects come from what was generated in [`CompilerProperties`](https://gist.github.com/bowbahdoe/3f6270a96260daa841329d7ef4a998d7). The whole structure implements the [`Diagnostic`](https://docs.oracle.com/en/java/javase/20/docs/api/java.compiler/javax/tools/Diagnostic.html) interface, which is part of Java's public API.<sup><a href="#26">26</a></sup> ### Context One of the more interesting concepts in `javac` is its `Context` mechanism. In "regular" code that wants a singleton, you generally hide the constructor for your class and expose a single instance in some way. ```java final class Apple { private Apple() {} public static final Apple INSTANCE = new Apple(); } ``` The other option is to make your class "normal" but rely on some dependency injection framework to automatically create, manage, and provide singular instances of that class. ```java final class Apple { // ... } class UsageSite { private Apple apple; @Inject UsageSite(Apple apple) { this.apple = apple; } } ``` `javac` wants only single instances of many of its classes, but it also wants to allow for multiple instances of the compiler to run in parallel on the same JVM. The solution they use is to have one class, [`Context`](https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/util/Context.java), which holds a map of `Context.Key` to `Object`s. 
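To make that concrete, here is a minimal sketch of the idea. It is not `javac`'s actual `Context` class: `MiniContext`, its `Key` type, and the `Counter` class are hypothetical names used only for illustration, and anything beyond plain get and put is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stripped-down stand-in for javac's Context.
final class MiniContext {
    // Keys are compared by identity; the type parameter only helps the caller.
    static final class Key<T> {}

    private final Map<Key<?>, Object> table = new HashMap<>();

    <T> void put(Key<T> key, T value) {
        table.put(key, value);
    }

    @SuppressWarnings("unchecked")
    <T> T get(Key<T> key) {
        return (T) table.get(key);
    }
}

// A "contextual singleton": at most one instance per MiniContext.
final class Counter {
    private static final MiniContext.Key<Counter> counterKey =
        new MiniContext.Key<>();

    int count = 0;

    private Counter(MiniContext context) {
        // Register this instance so later lookups in the same context find it.
        context.put(counterKey, this);
    }

    // Get-or-create for the given context.
    static Counter instance(MiniContext context) {
        Counter instance = context.get(counterKey);
        if (instance == null) {
            instance = new Counter(context);
        }
        return instance;
    }
}

class ContextDemo {
    public static void main(String[] args) {
        MiniContext first = new MiniContext();
        MiniContext second = new MiniContext();

        // Same context: same instance. Different contexts: separate instances.
        System.out.println(Counter.instance(first) == Counter.instance(first));  // true
        System.out.println(Counter.instance(first) == Counter.instance(second)); // false
    }
}
```

Because nothing is stored in a static field except the key, two compilers with two contexts never see each other's state.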
Classes like [`JCDiagnostic.Factory`](https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/util/JCDiagnostic.java#L56) have factory methods that do a get-or-create with their own constant `Context.Key`s. ```java public static class Factory { /** The context key for the diagnostic factory. */ protected static final Context.Key<JCDiagnostic.Factory> diagnosticFactoryKey = new Context.Key<>(); /** Get the Factory instance for this context. */ public static Factory instance(Context context) { Factory instance = context.get(diagnosticFactoryKey); if (instance == null) instance = new Factory(context); return instance; } // ... } ``` And then this `Context` is threaded to every class in the compiler that wants to get instances of those "contextual singletons" or themselves participate in the mechanism. ```java protected Flow(Context context) { context.put(flowKey, this); names = Names.instance(context); log = Log.instance(context); syms = Symtab.instance(context); types = Types.instance(context); chk = Check.instance(context); lint = Lint.instance(context); rs = Resolve.instance(context); diags = JCDiagnostic.Factory.instance(context); Source source = Source.instance(context); } ``` ### Log During compilation, if a problem is encountered anywhere, the compiler constructs and "emits" a diagnostic. It does this by getting an instance of the `Log` contextual singleton and using the generated constants from [`CompilerProperties`](https://gist.github.com/bowbahdoe/3f6270a96260daa841329d7ef4a998d7). ```java public class Operators { // ... protected Operators(Context context) { context.put(operatorsKey, this); syms = Symtab.instance(context); names = Names.instance(context); log = Log.instance(context); types = Types.instance(context); noOpSymbol = new OperatorSymbol( names.empty, Type.noType, -1, syms.noSymbol ); initOperatorNames(); initUnaryOperators(); initBinaryOperators(); } // ... 
private OperatorSymbol reportErrorIfNeeded( DiagnosticPosition pos, Tag tag, Type... args ) { if (Stream.of(args).noneMatch(t -> t.isErroneous() || t.hasTag(TypeTag.NONE) )) { Name opName = operatorName(tag); JCDiagnostic.Error opError = (args.length) == 1 ? Errors.OperatorCantBeApplied( opName, args[0] ) : Errors.OperatorCantBeApplied1( opName, args[0], args[1] ); log.error(pos, opError); } return noOpSymbol; } // ... } ``` `Log` internally holds onto the `JCDiagnostic.Factory` contextual singleton in order to construct `JCDiagnostic`s from `DiagnosticInfo`s like `JCDiagnostic.Error`.<sup><a href="#27">27</a></sup> ```java public class Log extends AbstractLog { // ... private Log( Context context, Map<WriterKind, PrintWriter> writers ) { super(JCDiagnostic.Factory.instance(context)); context.put(logKey, this); this.writers = writers; @SuppressWarnings("unchecked") // FIXME DiagnosticListener<? super JavaFileObject> dl = context.get(DiagnosticListener.class); this.diagListener = dl; diagnosticHandler = new DefaultDiagnosticHandler(); messages = JavacMessages.instance(context); messages.add(Main.javacBundleName); final Options options = Options.instance(context); initOptions(options); options.addListener(() -> initOptions(options)); } // ... } ``` All the diagnostics are then reported to a `DiagnosticHandler`. ```java public class Log extends AbstractLog { // ... @Override public void report(JCDiagnostic diagnostic) { diagnosticHandler.report(diagnostic); } // ... } ``` ### DiagnosticFormatter There are some steps I am skipping, but eventually diagnostics flow from `DiagnosticHandler`s to a `DiagnosticFormatter` which is responsible for formatting the diagnostic for display. There are a few implementations of `DiagnosticFormatter`, but the most relevant is [`BasicDiagnosticFormatter`](https://github.com/openjdk/jdk/blob/ec0cc6300a02dd92b25d9072b8b3859dab583bbd/src/jdk.compiler/share/classes/com/sun/tools/javac/util/BasicDiagnosticFormatter.java). 
BasicDiagnosticFormatter has three formats that it recognizes. Diagnostics with a position, diagnostics without a position, and diagnostics originating in a class file for which the source is not available. It uses custom format strings that describe how it should display each of those diagnostics. ```java private void initFormat() { initFormats("%f:%l:%_%p%L%m", "%p%L%m", "%f:%_%p%L%m"); } ``` For backwards compatibility reasons, `javac` also maintains an "old style" diagnostics format and a "normal" format. The old style diagnostics format can be enabled with compiler flags. ```java public static class BasicConfiguration extends SimpleConfiguration { public BasicConfiguration(Options options) { // ... initIndentation(); if (options.isSet("diags.legacy")) initOldFormat(); String fmt = options.get("diags.layout"); if (fmt != null) { if (fmt.equals("OLD")) initOldFormat(); else initFormats(fmt); } // ... } } ``` The format strings consist of "meta characters" that represent different components of the diagnostic. The meta characters and other components are formatted independently and then concatenated. For the normal format diagnostics with a position, each section of the string has a meaning as follows: | Component | Meaning | |-----------|--------------------------------------------------------------------------------| | %f | Source file name | | : | A literal ":" character (U+003A) | | %l | Line number for the diagnostic | | : | A literal ":" character (U+003A) | | %_ | A space character (U+0020) | | %p | The prefix for the diagnostic type: one of "Note: ", "warning: ", or "error: " | | %L | The lint category for this diagnostic, if it is a lint | | %m | The localized message for the diagnostic | After these components, the source code at the position of the diagnostic is inserted. 
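As a rough illustration of how such a format string can be expanded, the sketch below walks a format and substitutes each meta character from the table above. `FormatSketch` and `expand` are hypothetical and far simpler than the real `BasicDiagnosticFormatter`. In particular, the lint category is passed in as an already formatted string, which is an assumption made to keep the example short.

```java
// Hypothetical sketch; the real logic lives in BasicDiagnosticFormatter.
final class FormatSketch {
    static String expand(
        String format,
        String file,
        int line,
        String prefix,
        String lint,
        String message
    ) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < format.length(); i++) {
            char c = format.charAt(i);
            if (c == '%' && i + 1 < format.length()) {
                char meta = format.charAt(++i);
                switch (meta) {
                    case 'f' -> out.append(file);
                    case 'l' -> out.append(line);
                    case '_' -> out.append(' ');
                    case 'p' -> out.append(prefix);
                    case 'L' -> out.append(lint);
                    case 'm' -> out.append(message);
                    // Unknown meta characters are kept as written.
                    default -> out.append('%').append(meta);
                }
            } else {
                // Literal characters such as ':' are copied through.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints "Main.java:5: error: cannot find symbol"
        System.out.println(expand(
            "%f:%l:%_%p%L%m", "Main.java", 5, "error: ", "", "cannot find symbol"
        ));
    }
}
```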
The source code is inserted at the end of the diagnostic if the diagnostic message is a single line, or after the first line of the message if it is multiline.<sup><a href="#28">28</a></sup> ## Structural Problems If we want errors closer to what Rust has, the most important structural deficiency to tackle is `javac`'s message-orientedness. By that I am referring to the fact that every kind of diagnostic is "just a message." There is no clearly delineated place to put hints or other context.<sup><a href="#29">29</a></sup> This is something that has already had to be worked around. Take the following program. ```java public class Main { public static void main(String[] args) { Object o = 123; switch (o) { case Integer i -> System.out.println(i); default -> {} }; } } ``` This program uses pattern switches, which are a preview feature. This is the error you would get if you tried to compile it. ``` Main.java:5: error: patterns in switch statements are a preview feature and are disabled by default. case Integer i -> System.out.println(i); ^ (use --enable-preview to enable patterns in switch statements) 1 error ``` While you might look at this and think that there is already a dedicated place for hints, you would be wrong. This message comes from this entry in the `compiler.properties` file. ```properties # 0: message segment (feature) compiler.err.preview.feature.disabled.plural=\ {0} are a preview feature and are disabled by default.\n\ (use --enable-preview to enable {0}) ``` You will notice that the only thing separating the hint to use the `--enable-preview` flag from the initial message is a newline.
[`BasicDiagnosticFormatter`](https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/util/BasicDiagnosticFormatter.java#L108) just has a heuristic where it assumes that any newline in a message means that the lines following it should be displayed below the code that is the source of the issue. ```java public class BasicDiagnosticFormatter extends AbstractDiagnosticFormatter { // ... public String formatMessage(JCDiagnostic d, Locale l) { // ... if (lines.length > 1 && getConfiguration() .getVisible() .contains(DiagnosticPart.DETAILS)) { currentIndentation += getConfiguration() .getIndentation(DiagnosticPart.DETAILS); for (int i = 1; i < lines.length; i++) { buf.append( "\n" + indent(lines[i], currentIndentation) ); } } // ... return buf.toString(); } // ... } ``` This causes a problem for diagnostics that have a newline that is not accounted for in that heuristic like the following.<sup><a href="#30">30</a></sup> ```properties # TODO 308: make a better error message compiler.err.this.as.identifier=\ as of release 8, ''this'' is allowed as the parameter name for the receiver type only\n\ which has to be the first parameter, and cannot be a lambda parameter ``` If you construct a program that triggers this error like so: ```java public class Math { static int add(int a, int this) { return a + this; } } ``` You will get the following. ``` Math.java:2: error: as of release 8, 'this' is allowed as the parameter name for the receiver type only static int add(int a, int this) { ^ which has to be the first parameter, and cannot be a lambda parameter 1 error ``` Which feels at the very least unintentional. 
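The effect of the heuristic can be reproduced in a few lines. The sketch below is a simplification under stated assumptions: `NewlineHeuristic` and `render` are hypothetical names, the indentation is hard-coded, and the real formatter handles much more. It splits the message on newlines, prints the first line before the source snippet, and prints everything else after it.

```java
// Hypothetical sketch of the "first line above, the rest below" behavior.
final class NewlineHeuristic {
    static String render(
        String location,
        String message,
        String sourceLine,
        int caretColumn
    ) {
        String[] lines = message.split("\n");
        StringBuilder out = new StringBuilder();

        // The first line of the message goes next to the location.
        out.append(location).append(": error: ").append(lines[0]).append('\n');

        // Then the offending source line and a caret under the position.
        out.append("    ").append(sourceLine).append('\n');
        out.append("    ").append(" ".repeat(caretColumn)).append('^');

        // Every remaining line is treated as "details" and lands below the code,
        // whether or not the author of the message meant it as a hint.
        for (int i = 1; i < lines.length; i++) {
            out.append('\n').append("  ").append(lines[i].strip());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(
            "Math.java:2",
            "as of release 8, 'this' is allowed as the parameter name for the receiver type only\n"
                + "which has to be the first parameter, and cannot be a lambda parameter",
            "static int add(int a, int this) {",
            26
        ));
    }
}
```

A sentence that merely wraps onto a second line gets split around the source snippet exactly like a deliberate hint would.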
There are other places where the pressure to say more than one thing in a message [leads to larger irreconcilable inconsistencies](https://mail.openjdk.org/pipermail/compiler-dev/2022-February/019105.html).<sup><a href="#31">31</a></sup> Every year or so there is someone who complains about [a specific error on the mailing list](https://mail.openjdk.org/pipermail/compiler-dev/2022-September/020420.html). This usually leads to a concrete improvement in whatever error they complain about, but the root problem here is structural.<sup><a href="#32">32</a></sup> ## Structural Solutions Structural problems always require structural solutions, so that's what we aimed to do. The way Rust deals with this is with [structured diagnostics](https://rustc-dev-guide.rust-lang.org/diagnostics.html).<sup><a href="#33">33</a></sup> We translated that approach into the existing `JCDiagnostic` world by introducing two new concepts: `Help` and `Info`. In our prototype all `JCDiagnostic`s now carry an optional `Help` and an optional `Info`. ```java public class JCDiagnostic implements Diagnostic<JavaFileObject> { // ... private final DiagnosticSource source; private final DiagnosticPosition position; private final DiagnosticInfo diagnosticInfo; private final Set<DiagnosticFlag> flags; private final LintCategory lintCategory; private final Info info; private final Help help; // ... } ``` ### Help "Help"s carry two pieces of information. ```java public record Help( HelpFragment message, List<SuggestedChange> suggestedChanges ) { // ... } ``` First is a message. This is a fragment of text just like other `DiagnosticInfo`s and it is where the actual text of "use --enable-preview to enable patterns in switch statements" would come from. ```java public static final class HelpFragment extends DiagnosticInfo { public HelpFragment(String prefix, String code, Object... args) { super(HELP, prefix, code, args); } } ``` The second is a list of zero or more suggested changes. 
```java public record SuggestedChange( DiagnosticSource source, RangeDiagnosticPosition position, String replacement, Applicability applicability ) { // ... } ``` These `SuggestedChange`s all know where in the source they are referring to, what replacements to make in order to apply the suggestion, and to what degree the suggestion is mechanically applicable. ```java public enum Applicability { MACHINE_APPLICABLE, HAS_PLACEHOLDERS, UNKNOWN } ``` Help messages would include: * Suggesting importing a class or enum * Suggesting fixing a misspelled identifier to a similarly named identifier that actually exists * Explaining that an identifier doesn't exist, but it does exist in a class that is accessible * Suggesting changing arguments to a function to satisfy the signature ### Info `Info`s are similar in spirit to `Help`s, but they only provide helpful context. They do not suggest a user change their code in any concrete way. ```java public record Info( InfoFragment message, List<InfoPosition> positions ) { } ``` An `Info` holds a text fragment and a list of all the places in the code that are relevant to that message. Info messages would include: * Displaying related function signatures to show the programmer the expected signature * Displaying the declaration of a class, field, or local variable to show what type was expected * Providing more detailed information as to why something is not allowed, such as `assert` becoming a keyword in Java 1.4 * Displaying supplementary information related to valid values ### compiler.properties In order to get the text for help and info messages in a localization friendly way, we piggybacked on the existing conventions in the `compiler.properties` files. We updated the tooling so that properties keyed by `compiler.help.*` and `compiler.info.*` are translated into fragments inside `CompilerProperties` the same as was done for `compiler.err.*` and company.
```properties compiler.info.function.declared.here=\ function declared here # ... # 0: kind name, 1: symbol compiler.help.similar.symbol=\ a similarly named {0} exists: {1} ``` ```java public class CompilerProperties { public static class Infos { // ... /** * compiler.info.function.declared.here=\ * function declared here */ public static final InfoFragment FunctionDeclaredHere = new InfoFragment("compiler", "function.declared.here"); // ... } public static class Helps { // ... /** * compiler.help.similar.symbol=\ * a similarly named {0} exists: {1} */ public static HelpFragment SimilarSymbol( KindName arg0, Symbol arg1 ) { return new HelpFragment( "compiler", "similar.symbol", arg0, arg1 ); } // ... } } ``` At the sites where diagnostics are emitted, these `Help`s and `Info`s can then be attached to a `JCDiagnostic` by referencing the generated classes and using the `withHelp` or `withInfo` methods. ```java class SymbolNotFoundError extends ResolveError { // ... @Override JCDiagnostic getDiagnostic( JCDiagnostic.DiagnosticType dkind, DiagnosticPosition pos, Env<AttrContext> env, Type site, Name name, List<Type> argtypes, List<Type> typeargtypes ) { // ... if (hasLocation) { // ... if (suggestMember != null) { diag = diag.withHelp( new Help( Helps.SimilarSymbol( Kinds.kindName(suggestMember), suggestMember ) ) ); } return diag; } // ... } } ``` [The logic in the code above](https://github.com/McCue-Software-Solutions/jdk/blob/hints/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Resolve.java) produces errors like the following. ``` Test.java:5: error: cannot find symbol TetsingMetho(); ^ symbol: method TetsingMetho() location: class Test help: a similarly named method exists: TestingMethod() ``` What is important here isn't the analysis being performed or how it is displayed exactly. Those are all things that can be disputed by reasonable people. It's that there is now a place to give that sort of advice.
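For a feel of what the analysis behind such a help message could look like, here is a minimal sketch that picks the closest known name by edit distance. This is an assumption for illustration only: the prototype's actual matching logic is not described here, and `SimilarNames`, `editDistance`, `suggest`, and the distance threshold are all hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not the matching logic used in the prototype.
final class SimilarNames {
    // Classic Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1),
                    prev[j - 1] + cost
                );
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Return the closest known name, if any is within an arbitrary threshold.
    static Optional<String> suggest(String missing, List<String> known) {
        int threshold = Math.max(1, missing.length() / 3);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : known) {
            int distance = editDistance(missing, candidate);
            if (distance <= threshold && distance < bestDistance) {
                best = candidate;
                bestDistance = distance;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        suggest("TetsingMetho", List.of("main", "toString", "TestingMethod"))
            .ifPresent(name ->
                System.out.println("help: a similarly named method exists: " + name)
            );
    }
}
```

Whatever the analysis ends up being, its result has somewhere to go: the `Help` attached to the diagnostic.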
With just that little bit of structure, it suddenly becomes tractable to build a feature like "suggest methods with similar names." ### DiagnosticFormatter In order to retrofit `BasicDiagnosticFormatter` to display help and info messages, we needed to add some more meta characters to its format strings. ```java public class BasicDiagnosticFormatter extends AbstractDiagnosticFormatter { // ... public static class BasicConfiguration extends SimpleConfiguration { // ... private void initFormatWithInfoAndHelp() { initFormats( "%f:%l:%_%p%L%m%i%h", "%p%L%m%i%h", "%f:%_%p%L%m%i%h" ); } // ... } // ... } ``` | Component | Meaning | |-----------|--------------------------------------------------------------------------------| | %h | Help Message | | %i | Info Message | This has the very convenient property that, at least for this design, we can put the new reporting format behind a flag. ```java public class BasicDiagnosticFormatter extends AbstractDiagnosticFormatter { // ... public static class BasicConfiguration extends SimpleConfiguration { // ... public BasicConfiguration(Options options) { // ... if (options.isSet(/* ... */)) { initFormatWithInfoAndHelp(); } else { initFormat(); } // ... } private void initFormat() { initFormats( "%f:%l:%_%p%L%m", "%p%L%m", "%f:%_%p%L%m" ); } private void initFormatWithInfoAndHelp() { initFormats( "%f:%l:%_%p%L%m%i%h", "%p%L%m%i%h", "%f:%_%p%L%m%i%h" ); } // ... } // ... } ``` ### Remaining Work There are many things left unfinished. * Some of our modifications show suboptimal positions in particularly complex cases. Fixing this involves passing more context down through the compiler and keeping more positional information. * The wording of our additional info can likely be improved to flow better with the writing style of the existing errors. * The messages that make use of the newline heuristic, especially the `--enable-preview` ones, can be moved to having an attached `Help`. 
Doing so would mean that either the keys for those messages would need to be duplicated or the original messages would be degraded if the flag for this were turned off. * Almost every error and warning could use a tone audit. * In general, the compiler tends to drop information between phases. We didn't find a system other than to go case by case and just thread through the information we need as we need it. * `compiler.err.cant.apply.symbol` is several different errors in a trench-coat. * Ich spreche kein Deutsch. * 私は日本語を話しません * 我不会说中文 * And much, much more But I am satisfied with the progress we made. You can find a bestiary of the specific errors that were tackled [here](https://docs.google.com/document/d/10QlIIvYKkrxcWs_GMxQMRXhakYWBYiYanbetHEWIji0/edit?usp=sharing). ## Call to Action Submitting a JEP, while technically an open process, in practice seems to be helped by having more free time than is reasonable<sup><a href="#34">34</a></sup>, being paid to work on OpenJDK, or social capital. If you want this work to continue, you should voice your support on [compiler-dev@openjdk.org](https://mail.openjdk.org/pipermail/compiler-dev/) or in whatever forum you think would have the most impact. If you do want to use the mailing lists, take note that you need to sign up for the mailing list yourself in order for your emails to go through. If you are interested in continuing this work yourself, the current state of this is on the `hints` branch [here](https://github.com/mccue-software-solutions/jdk/tree/hints). There is quite a bit left to do and a lot of arcane knowledge we picked up along the way, so please reach out if you choose this path. I can at least help you get set up in IntelliJ to build the JDK.<sup><a href="#35">35</a></sup> If you are in a position of authority at one of the large companies that has dedicated staff working on OpenJDK, hire some or all of us.
Failing that, dedicate other person-power to the issue.<sup><a href="#36">36</a></sup><sup><a href="#37">37</a></sup> ## JEP <details> <summary>JEP Draft</summary> Summary ------- Enhance the Java compiler with errors that are easier to read and understand. Goals ----- The primary goal is to make the reference Java compiler competitive with the compilers for other languages in terms of the helpfulness of its error messages. This is proposed to be accomplished by * Enhancing the compiler so that it can reliably display hints to the user for how to resolve issues. * Enhancing the compiler so that it can provide information to the user that indicates why an error occurred. * Auditing the tone of the existing set of messages. Non-Goals --------- It is not a priority to alter the set of warnings and lints that the compiler reports on, though that could feasibly come as a future JEP. It is also not a priority to give this same treatment to other tools, such as `jar`, `javadoc`, or `jlink`, though that could feasibly come as a future JEP. It is not a goal to provide an equivalent to the `--explain` flag present in other compilers or to assign error conditions unique numeric codes, though that could feasibly come as a future JEP. It is not a goal to provide a structured output for consumption by IDEs or tools, though that could feasibly come as a future JEP. It is not an explicit goal to enhance the API provided by `java.compiler` to allow annotation processors and other user code to introspect on any new functionality, though that might fall out of the design process. It is a non-goal to completely refactor the entire javac diagnostic process or to modify every single error message. Some are fine as they are. It is not a goal to provide any specific kind of analysis, though it is assumed that some new analyses should be performed.
Description ----------- (The following represents a preliminary design and is subject to change) We propose modifying the structure of `javac`'s diagnostic system by adding `Help` and `Info` structures to `JCDiagnostic`. `Help`s would provide information to users with suggestions on how to change their code. The localized text for `Help`s would be provided by properties keyed under `compiler.help.*`. Information in a help message should be actionable information or ideally a functional code suggestion. Each code suggestion in a help contains a range in the source code that it applies to, a string to replace the source code with, and an enum representing whether the suggestion can be applied automatically, needs some manual work, or can’t be automatically applied. `Info`s would provide useful context to users on why they are receiving a given warning or error. The localized text for `Info`s would be provided by properties keyed under `compiler.info.*`. The code to format diagnostics for display will also be updated to support embedding these two pieces of information and existing messages will be updated to make appropriate use of the new structure. Alternatives ------------ * Do nothing. There is significant opportunity cost in any restructuring, and it might be the case that a restructuring of error messages does not provide enough practical benefit to be worth the effort. * Wait. It's certainly possible that the ideas currently available represent a local maximum of compiler design and that going toward them would be a misstep. * Defer to the community. It might be possible to accomplish most of these goals with an alternative compiler implementation. `javac`'s primary purpose is to be a reference compiler for the JLS. This would limit its exposure though, especially to beginners. * Leave it to IDEs. Visual feedback is, in some ways, preferable to textual feedback.
Not all categories of users, particularly the users for which feedback matters the most (students, beginners, etc.), are able or willing to use IDEs. Testing ------- The compiler should still give errors in the same situations after the changes as before, and still emit identical bytecode. The existing set of `jtreg` tests should be enough to validate that, though they will need to be updated to test for items related to the new structure. Testing for whether errors are actually useful is a social problem. Risks and Assumptions --------------------- Localization will likely pose a challenge. The current corpus of error messages is significant and would need to be updated. There are also undoubtedly tools in the ecosystem that function off of parsing the exact structure emitted by javac. Ultimately either they will need to be broken, the new functionality would have to be hidden behind a compiler flag, or the old functionality will have to be explicitly enabled. It is possible that keeping track of the information needed to provide a good hint to the user could increase the memory footprint of the compiler during successful runs. The speed of the compiler could also be affected. Ideally this could be minimized, but it is hard to know in what way until an MVP is in place. Dependencies ------------ What hints should be given in which scenarios should conceptually be driven by data on what error conditions are actually hit. This can be further stratified by which errors are hit by different groups such as "total beginners", "working developers", etc. If there is not already a corpus of this sort of data then it would be prudent to try to organize a way to gather it. There are active projects like Valhalla and Amber that will likely result in significant updates to the compiler. It might be necessary to wait for those changes to "blow over" before there is enough stability to make structural changes.
These projects also alter the semantics of the language, so they could feasibly affect what ideal error messages would be. </details> <p id="1" style="font-size: 14px">1: And well over a year of mental illness from me.</p> <p id="2" style="font-size: 14px">2: This definition isn't exactly accurate. JITs are compilers too but they do their work in memory. There is no artifact to speak of, but I think it's close enough to what most people think of when they think of a compiler. The <a href="https://en.wikipedia.org/wiki/Compiler">Wikipedia entry for compilers</a> has a more accurate definition.</p> <p id="3" style="font-size: 14px">3: Languages like C, C++, Rust, Zig, etc. also care quite a lot about optimizations. It might be a little disingenuous of me to gloss over that, but this post is about a language backed by a VM. The real optimizations happen at runtime and dwelling too long on that felt a bit much.</p> <p id="4" style="font-size: 14px">4: The Elm language is closest in spirit to Haskell. All code needs to be purely functional, the type system is strong, and the syntax is similar as well. What sets it apart is that it doesn't have classic Haskell features like type-classes, do-notation, or `IO` monads. Good case study in addition by subtraction.</p> <p id="5" style="font-size: 14px">5: I'm attributing causation here, but I don't actually have strong proof that the better error messages work benefited from Elm being small and focused. At least in my brain it tracks though. Evan Czaplicki is but one man. Doing what he did but with C++ would have been infeasible.</p> <p id="6" style="font-size: 14px">6: If Twitter dies, just know that these links were to people saying nice things about Elm.</p> <p id="7" style="font-size: 14px">7: This isn't actually that good of an argument in and of itself. I know that. I just find it rhetorically compelling.
I love Elden Ring, but I don't want to be playing it every day at my job and I wouldn't want to force someone to deal with all the flavors of "lol, get rekt" it entails.</p> <p id="8" style="font-size: 14px">8: "Basic analysis" is doing a lot of legwork in that sentence. It's basic in large part because of the simplicity of the Elm language. Libraries, modules, and source files work in precisely one way. Doing the same kind of analysis with classes on the class-path, modules on the module-path, and code yet to be compiled might be quite a bit harder for Java.</p> <p id="9" style="font-size: 14px">9: I'm doing it for the example directly below, but it's hard to get the pandoc renderer which makes this site to like inline spans with color information. I'll only do that work when I'm talking about coloring and, because I'm writing these footnotes as I go, I won't rule out just falling back to images.</p> <p id="10" style="font-size: 14px">10: This is nowhere near a complete explanation of the borrow checker, how it functions, or what it accomplishes. For that you should read the Rust book or any of the many good references online. I'm trusting that most readers will have a passing familiarity.</p> <p id="11" style="font-size: 14px">11: This explanation conflates borrow checking and lifetime checking a little bit. There are also no "borrows" that happen in the following code, but I've seen "the borrow checker" used interchangeably with what could be called "the lifetime checker" and at the level of zoom this post is at I think the difference isn't that crucial.
Doesn't help that I'm not exactly sure where the lines for these terms are regardless.</p> <p id="12" style="font-size: 14px">12: I think there is decent evidence for this, but it is really hard to prove.</p> <p id="13" style="font-size: 14px">13: The initial design for which seems to have been directly inspired by Elm.</p> <p id="14" style="font-size: 14px">14: During the aforementioned "fighting the borrow checker" phase.</p> <p id="15" style="font-size: 14px">15: You could probably make this same argument for any part of the Rust ecosystem like the polish in Cargo, Clippy, etc.</p> <p id="16" style="font-size: 14px">16: Compare how "cannot find symbol" sounds versus "cannot find value `A` in this scope". The difference is subtle, but it matters.</p> <p id="17" style="font-size: 14px">17: IntelliJ does this with a "light bulb" that appears when you hover over a bit of malformed code and opens a contextual menu with potential fixes.</p> <p id="18" style="font-size: 14px">18: I do not remember exactly what they are called, but my old high school had a bunch of machines that were basically just empty boxes that connected to a shared Windows server.</p> <p id="19" style="font-size: 14px">19: I hesitate to mention this, but the CS gender ratio is absolutely wack <a href="https://www.smithsonianmag.com/smithsonian-institution/margaret-hamilton-led-nasa-software-team-landed-astronauts-moon-180971575/">when famously it did not start out that way</a>. There are lots of shitty reasons for this, and I'm not saying that it's Java's fault, but it is worthy of note that Java has been an <a href="https://www.infoworld.com/article/2076867/educators-embrace-java.html">extremely popular first language in education ever since its release</a> and as a consequence <a href="https://www.aei.org/carpe-diem/chart-of-the-day-the-declining-female-share-of-computer-science-degrees-from-28-to-18/"> has presided over this depressing chart</a>.
I can't prove it<sup><a href="#20">20</a></sup> but I believe that small frictions like `public static void main` and obtuse error messages filter students down to the demographics that are willing to deal with that sort of hostility.</p> <p id="20" style="font-size: 14px">20: I haven't been able to find any meaningful research to back up or refute this claim. That must be because there is none, I'm bad at searching, or I've unconsciously ignored evidence that refutes my point.</p> <p id="21" style="font-size: 14px">21: "I'm right and science hasn't caught up to how right I am."</p> <p id="22" style="font-size: 14px">22: I will later be shown that I was at least somewhat wrong about this.</p> <p id="23" style="font-size: 14px">23: No relation.</p> <p id="24" style="font-size: 14px">24: These are the original <a href="https://docs.google.com/presentation/d/1JnqZwxYHAFI80JjHcoU7e8SI8U-I3CXnA1KM0Ryt4Q8/edit?usp=sharing">PowerPoint</a> and <a href="https://docs.google.com/document/d/10QlIIvYKkrxcWs_GMxQMRXhakYWBYiYanbetHEWIji0/edit?usp=sharing">Design Document</a> they produced for their class. </p> <p id="25" style="font-size: 14px">25: Other compilers like ECJ exist, but being the default means javac is always going to be what most people use. That's why the focus was there; it is where changes can do the most good.</p> <p id="26" style="font-size: 14px">26: Something that is merciful about wanting to change the internals of javac is that things like <code>Diagnostic</code> are all I need to worry about.
If we're not changing the set of programs accepted by the Java compiler and aren't looking to change supported APIs, then we should be able to avoid sanction by Java's backwards compatibility policies.</p> <p id="27" style="font-size: 14px">27: The FIXME in this code is from 2011.</p> <p id="28" style="font-size: 14px">28: This explanation was taken almost verbatim from the students' paper.</p> <p id="29" style="font-size: 14px">29: Giving the information a dedicated and delineated position.</p> <p id="30" style="font-size: 14px">30: This FIXME is from 2013.</p> <p id="31" style="font-size: 14px">31: I'll take "irreconcilable inconsistencies" for $500.</p> <p id="32" style="font-size: 14px">32: Structural issues like this I don't think are because of incompetence. The goal of `javac` has always been to be a reference compiler for Java. Everyone wants "good" errors, but the status quo is a result of the natural trend when a diagnostic ~= a line of output.</p> <p id="33" style="font-size: 14px">33: Structure, structure, structure.</p> <p id="34" style="font-size: 14px">34: If you get enough commits into OpenJDK and are a committer you can submit a JEP. That requires fixing a lot of JDK issues and going through a nomination process. I have Hogan's Heroes to binge.</p> <p id="35" style="font-size: 14px">35: That took a while for us to figure out even with the tutorials that exist.</p> <p id="36" style="font-size: 14px">36: Even if you aren't convinced that the approach we've taken is the one to continue with, hopefully you recognize that there is a real problem here.</p> <p id="37" style="font-size: 14px">37: I, at least, am not particularly special. I do however have a lot of context on this area of the code by now.</p> Sun, 13 Aug 0023 05:00:00 +0000Make your own Optionalshttps://mccue.dev/pages/3-28-23-custom-optional This is `java.util.Optional`. I took out all the comments and did a little reformatting, but this is the entire class. 
Just around 150 lines managing one nullable field. Take a minute to read or skim it before moving on.

```java
public final class Optional<T> {
    private static final Optional<?> EMPTY = new Optional<>(null);

    private final T value;

    public static <T> Optional<T> empty() {
        @SuppressWarnings("unchecked")
        Optional<T> t = (Optional<T>) EMPTY;
        return t;
    }

    private Optional(T value) {
        this.value = value;
    }

    public static <T> Optional<T> of(T value) {
        return new Optional<>(Objects.requireNonNull(value));
    }

    @SuppressWarnings("unchecked")
    public static <T> Optional<T> ofNullable(T value) {
        return value == null
                ? (Optional<T>) EMPTY
                : new Optional<>(value);
    }

    public T get() {
        if (value == null) {
            throw new NoSuchElementException("No value present");
        }
        return value;
    }

    public boolean isPresent() {
        return value != null;
    }

    public boolean isEmpty() {
        return value == null;
    }

    public void ifPresent(Consumer<? super T> action) {
        if (value != null) {
            action.accept(value);
        }
    }

    public void ifPresentOrElse(
            Consumer<? super T> action,
            Runnable emptyAction
    ) {
        if (value != null) {
            action.accept(value);
        } else {
            emptyAction.run();
        }
    }

    public Optional<T> filter(Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate);
        if (!isPresent()) {
            return this;
        } else {
            return predicate.test(value) ? this : empty();
        }
    }

    public <U> Optional<U> map(
            Function<? super T, ? extends U> mapper
    ) {
        Objects.requireNonNull(mapper);
        if (!isPresent()) {
            return empty();
        } else {
            return Optional.ofNullable(mapper.apply(value));
        }
    }

    public <U> Optional<U> flatMap(
            Function<? super T, ? extends Optional<? extends U>> mapper
    ) {
        Objects.requireNonNull(mapper);
        if (!isPresent()) {
            return empty();
        } else {
            @SuppressWarnings("unchecked")
            Optional<U> r = (Optional<U>) mapper.apply(value);
            return Objects.requireNonNull(r);
        }
    }

    public Optional<T> or(
            Supplier<? extends Optional<? extends T>> supplier
    ) {
        Objects.requireNonNull(supplier);
        if (isPresent()) {
            return this;
        } else {
            @SuppressWarnings("unchecked")
            Optional<T> r = (Optional<T>) supplier.get();
            return Objects.requireNonNull(r);
        }
    }

    public Stream<T> stream() {
        if (!isPresent()) {
            return Stream.empty();
        } else {
            return Stream.of(value);
        }
    }

    public T orElse(T other) {
        return value != null ? value : other;
    }

    public T orElseGet(Supplier<? extends T> supplier) {
        return value != null ? value : supplier.get();
    }

    public T orElseThrow() {
        if (value == null) {
            throw new NoSuchElementException("No value present");
        }
        return value;
    }

    public <X extends Throwable> T orElseThrow(
            Supplier<? extends X> exceptionSupplier
    ) throws X {
        if (value != null) {
            return value;
        } else {
            throw exceptionSupplier.get();
        }
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }

        return obj instanceof Optional<?> other
                && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }

    @Override
    public String toString() {
        return value != null
                ? ("Optional[" + value + "]")
                : "Optional.empty";
    }
}
```

## Why does `Optional` exist

`java.util.Optional` was introduced in Java 8 alongside the `Stream` API. Its raison d'etre is to make coders explicitly consider what to do when using methods like `findFirst` on a potentially empty `Stream`.

```java
// Explicitly throws
int valueOne = list
    .stream()
    .map(x -> x + 1)
    .filter(x -> x % 2 == 0)
    .findFirst()
    .orElseThrow();

// Explicitly uses a default value
int valueTwo = list
    .stream()
    .map(x -> x + 1)
    .filter(x -> x % 2 == 0)
    .findFirst()
    .orElse(0);
```

The deficiency it targets is in the interaction between `null` and "method chaining style". When there are so many method calls stacked up, it is hard for people to remember to handle cases like `null` return values.

So with streams poised to encourage method chaining, `Optional` was needed to make that API not lead to hidden bugs.
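To make that concrete, here is a rough sketch contrasting a null-returning lookup with the `Optional`-returning stream version. The class name `FirstEven` and both method names are made up for illustration; they are not part of the JDK or any library.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper class, made up for this illustration.
final class FirstEven {
    private FirstEven() {}

    // Null-returning style: nothing forces the caller to handle "not found".
    static Integer firstEvenOrNull(List<Integer> list) {
        for (int x : list) {
            int incremented = x + 1;
            if (incremented % 2 == 0) {
                return incremented;
            }
        }
        return null;
    }

    // Optional-returning style: the caller has to pick orElse, orElseThrow, etc.
    static Optional<Integer> firstEven(List<Integer> list) {
        return list.stream()
                .map(x -> x + 1)
                .filter(x -> x % 2 == 0)
                .findFirst();
    }
}
```

Writing `FirstEven.firstEvenOrNull(List.of(2, 4)) + 1` compiles fine and throws a `NullPointerException` at runtime, while `FirstEven.firstEven(List.of(2, 4))` hands back an empty `Optional` that has to be unwrapped on purpose.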
## What's wrong with `Optional`?

Nothing really.

The core tension that leads to so much discourse is that there is no way to represent `null` in Java's type system. Whether from lived experience or religious fervor, folks tend to be afraid of an unaccounted for `null`.

Because `Optional` is in the standard library and explicitly represents "absence or presence", it is extremely tempting to just replace every nullable thing with an `Optional<T>`.

Doing this can lead to code that sucks, especially if you try to avoid `null` for local variables.

```java
// Some might try to use isPresent()/get() to avoid null
Optional<String> nameOpt = f();
Optional<Integer> ageOpt = g();

if (nameOpt.isPresent() && ageOpt.isPresent()) {
    var name = nameOpt.get();
    var age = ageOpt.get();
    System.out.println(
        name + " is " + age + " years old"
    );
}
```

```java
// Others might try to map/flatMap.
Optional<String> nameOpt = f();
Optional<Integer> ageOpt = g();

nameOpt
    .flatMap(name -> ageOpt.map(age ->
        name + " is " + age + " years old"
    ))
    .ifPresent(System.out::println);
```

```java
// But it's questionable what's gained over null.
String name = f().orElse(null);
Integer age = g().orElse(null);

if (name != null && age != null) {
    System.out.println(
        name + " is " + age + " years old"
    );
}
```

But this is honestly _fine_.

Yes, the `Optional` will use up more memory and perform a bit worse than the equivalent code with `null`.

Yes, code written with `isPresent`/`get`/`orElseThrow` or `map`/`flatMap` can be a bit crusty.

Yes, it wasn't intended to be a field or a method parameter.

There are a lot of bike sheds to build and "best practices" to get into internet fights over.

But [`jspecify`](https://jspecify.dev/) is poised to give standard nullability annotations and tooling to augment the type system with them. Project Valhalla is considering giving a way to express [null restricted storage](https://openjdk.org/jeps/401).
In the fullness of time, the core tension that leads to this "overuse" seems like it will be resolved. The problem with both `Optional` and `null` is that they _only_ convey that some data might be absent and not what being absent implies. ## The Meaning of Absence Say you were writing a program which had to record peoples' first and last names for legal reasons. Users can still sign up, but they will need to give that information before continuing on to other parts of the app. Today you might see `Optional` being used to represent that. ```java import java.util.Optional; record Person( int id, Optional<String> firstName, Optional<String> lastName ) {} ``` In the near future, maybe a `@Nullable` annotation. ```java import org.jspecify.annotations.Nullable; record Person( int id, @Nullable String firstName, @Nullable String lastName ) {} ``` In both cases - `null` and an empty `Optional` - an absent value implies that you have not been given that information yet. You can use this to know when to stop a user and ask them for their name. ```java import java.util.Optional; record Person( int id, Optional<String> firstName, Optional<String> lastName ) { boolean shouldAskForInfo() { return firstName.isEmpty() || lastName.isEmpty(); } } ``` ```java import org.jspecify.annotations.Nullable; record Person( int id, @Nullable String firstName, @Nullable String lastName ) { boolean shouldAskForInfo() { return firstName == null || lastName == null; } } ``` Now, consider Madonna. Madonna does not have a last name. If a `null` or empty value in the `lastName` field means "not provided", you have no way to directly represent "known to not exist." ```java // Need to ask Bob for his last name still var bob = new Person(1, "Bob", null); // Shouldn't ask Madonna for anything var madonna = new Person(2, "Madonna", null); ``` Using an empty string is tempting, but if you do that you will have the same problem that `null` currently has. 
By having a "special" value not expressed in the type system, you are liable to forget to check for that special value. ```java // Empty string can be a sentinel var madonna = new Person(2, "Madonna", ""); // But if you forget that it is special // you might give Madonna a subpar user experience var welcome = "Hello " + person.firstName() + " " + person.lastName() + "!"; // "Hello Madonna !" // She'll notice. She'll hate you. ``` The reality of our fictional data model is that we have three distinct cases. 1. We have not been given a last name. 2. We have been told there is no last name. 3. We have been given a last name. The most convenient tool we have for representing this sort of situation is a `sealed interface`. ```java sealed interface LastName { record NotGiven() implements LastName {} record DoesNotExist() implements LastName {} record Given(String value) implements LastName {} } ``` Now when a `LastName` has an absent value, we can know whether that is because it doesn't exist or we just haven't been told. ```java import org.jspecify.annotations.Nullable; record Person( int id, @Nullable String firstName, LastName lastName ) { boolean shouldAskForInfo() { return firstName == null || lastName instanceof LastName.NotGiven; } } ``` And we can properly represent Madonna. ```java // Need to ask Bob for his last name still var bob = new Person(1, "Bob", new LastName.NotGiven()); // Shouldn't ask Madonna for anything var madonna = new Person(2, "Madonna", new LastName.DoesNotExist()); // Joe is all set var joe = new Person(3, "Joe", new LastName.Given("Shmoe")); ``` `Optional` and `null` let you represent exactly 2 possibilities, a sealed hierarchy lets you represent 2 or more possibilities. The reason I'm using the Madonna example is that it is a straw-man where you want to represent 3 distinct possibilities. My bold claim is that even when there are only 2 possibilities you should still consider making your own class instead of using `Optional` or `null`. 
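As a rough sketch of what the three-case version buys you, here is a greeting that has to deal with every case before it compiles. `Greeter` and `welcome` are hypothetical names picked for this example, `LastName` is repeated so the snippet stands alone, and the switch assumes Java 21 or later.

```java
// Repeated from above so this sketch is self-contained.
sealed interface LastName {
    record NotGiven() implements LastName {}
    record DoesNotExist() implements LastName {}
    record Given(String value) implements LastName {}
}

// Hypothetical helper, not part of any library.
final class Greeter {
    private Greeter() {}

    static String welcome(String firstName, LastName lastName) {
        // The switch is exhaustive: leaving out a case is a compile error,
        // so "known to not exist" cannot be silently forgotten.
        return switch (lastName) {
            case LastName.NotGiven notGiven ->
                    "Hello " + firstName + "! Please tell us your last name.";
            case LastName.DoesNotExist doesNotExist ->
                    "Hello " + firstName + "!";
            case LastName.Given given ->
                    "Hello " + firstName + " " + given.value() + "!";
        };
    }
}
```

With this, `Greeter.welcome("Madonna", new LastName.DoesNotExist())` gives `"Hello Madonna!"` with no stray space, because the "no last name" case is a type the compiler makes you handle rather than a sentinel you have to remember.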
Both `@Nullable String firstName` and `Optional<String> firstName` do not directly convey what it means if the data is missing. It's just "absent." The fact that it means you haven't been told is context external to your domain model.

It's a similar problem to [primitive obsession](https://sandimetz.com/blog/2014/9/9/shape-at-the-bottom-of-all-things). Because `null` and `Optional` are there and fit the "shape" we want, we gravitate to them.

What if instead of that we were to make our own "optional" class?

```java
sealed interface FirstName {
    record NotGiven() implements FirstName {}
    record Given(String value) implements FirstName {}
}
```

So here `FirstName` is identical in spirit to an `Optional<String>`, but with the benefit of us being able to give a name to the situation where there is no value. It's not empty or present, we were either given a first name or we weren't.

With pattern matching you will be able to switch over these two situations.

```java
switch (person.firstName()) {
    case FirstName.NotGiven _ ->
        System.out.println("No first name");
    case FirstName.Given(String name) ->
        System.out.println("First name is " + name);
}
```

And part of the reason I put all the code for `Optional` at the top was to impress upon you how trivial it would be to add any of those helper methods to a class you made yourself.

```java
sealed interface FirstName {
    record NotGiven() implements FirstName {
        @Override
        public String orElse(String defaultValue) {
            return defaultValue;
        }
    }

    record Given(String value) implements FirstName {
        @Override
        public String orElse(String defaultValue) {
            return this.value;
        }
    }

    String orElse(String defaultValue);
}
```

```java
var name = person.firstName().orElse("?");
```

That's it. That's the thesis.

If you are spending time modeling your domain objects, consider making your own versions of an `Optional` class. You can choose names that align more closely with your domain, adapt to more varied situations, and the boilerplate for doing so is at a historic low.
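If you want more than `orElse`, the same approach stretches to other helpers. Here is a sketch using default methods and a switch instead of per-record overrides. The method names `map` and `toOptional` are chosen for illustration, and the code assumes Java 21 or later.

```java
import java.util.Optional;
import java.util.function.Function;

sealed interface FirstName {
    record NotGiven() implements FirstName {}
    record Given(String value) implements FirstName {}

    // Illustrative helper: transform the name if we were given one,
    // otherwise stay NotGiven.
    default FirstName map(Function<String, String> f) {
        return switch (this) {
            case NotGiven notGiven -> notGiven;
            case Given given -> new Given(f.apply(given.value()));
        };
    }

    // Illustrative helper: bridge to java.util.Optional for APIs that expect one.
    // Optional.of throws if a Given was constructed with a null value.
    default Optional<String> toOptional() {
        return switch (this) {
            case NotGiven notGiven -> Optional.empty();
            case Given given -> Optional.of(given.value());
        };
    }
}
```

Whether helpers live on each record or as default methods on the interface is a matter of taste; the switch version keeps both cases of a helper next to each other.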
I will admit that if you have a huge number of fields with potentially missing data this can be more trouble than it's worth. I still think it's worth considering.

Tue, 28 Mar 0023 05:00:00 +0000Please try my JSON libraryhttps://mccue.dev/pages/2-26-23-json

<meta property="og:image" content="https://mccue.dev/pages/2-26-23-bopbop.png" /> <meta name="twitter:image" content="https://mccue.dev/pages/2-26-23-bopbop.png">

[![bowbahdoe/json - GitHub](https://gh-card.dev/repos/bowbahdoe/json.svg)](https://github.com/bowbahdoe/json)

For the past four months I've been working on [a JSON library for Java](https://github.com/bowbahdoe/json).

It's not original. Most of the implementation of the parser I stole from Clojure's [data.json](https://github.com/clojure/data.json) and the user facing API is a total ripoff of [Elm](https://elm-lang.org/)'s [JSON library](https://package.elm-lang.org/packages/elm/json/latest/). The only novel engineering I've done has been in translation.

It's also not amazingly fast. Last I benchmarked it, it was around [5x as slow as Jackson](https://github.com/fabienrenaud/java-json-benchmark/issues/59#issuecomment-1345650635), the current king of Java's JSON castle. There are paths to improve that but, whether for lack of time or ability, I haven't explored any of them.

Despite all that, I think you should try it. The rest of this post is going to be an effort to convince you to do so.

First, I am going to go through a basic tutorial to get you up to speed. Then I am going to go through some pitches that I hope convince you.

## Tutorial

### The Data Model

JSON is a data format. It looks like the following sample.

```json
{
    "name": "kermit",
    "wife": null,
    "girlfriend": "Ms. Piggy",
    "age": 22,
    "children": [
        {
            "species": "frog",
            "gender": "male"
        },
        {
            "species": "pig",
            "gender": "female"
        }
    ],
    "commitmentIssues": true
}
```

In JSON you represent data using a combination of objects (maps from strings to JSON), arrays (ordered sequences of JSON), strings, numbers, true, false, and null.

Therefore, one "natural" way to think about the data stored in a JSON document is as the union of those possibilities.

```
JSON is one of
- a map of string to JSON
- a list of JSON
- a string
- a number
- true
- false
- null
```

The way to represent this in Java is using a sealed interface, which provides an explicit list of types which are allowed to implement it.

```java
public sealed interface Json
        permits
            JsonObject,
            JsonArray,
            JsonString,
            JsonNumber,
            JsonBoolean,
            JsonNull {
}
```

This means that if you have a field or variable which has the type `Json`, you know that it is either a `JsonObject`, `JsonArray`, `JsonString`, `JsonNumber`, `JsonBoolean`, or `JsonNull`.

That is the first thing provided by my library. There is a `Json` type and subtypes representing those different cases.

```java
import dev.mccue.json.*;

public class Main {
    static Json greeting() {
        return JsonString.of("hello");
    }

    public static void main(String[] args) {
        Json json = greeting();
        switch (json) {
            case JsonObject object ->
                    System.out.println("An object");
            case JsonArray array ->
                    System.out.println("An array");
            case JsonString str ->
                    System.out.println("A string");
            case JsonNumber number ->
                    System.out.println("A number");
            case JsonBoolean bool ->
                    System.out.println("A boolean");
            case JsonNull __ ->
                    System.out.println("A json null");
        }
    }
}
```

You can create instances of these subtypes using factory methods on the types themselves.
```java
import dev.mccue.json.*;

import java.util.List;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        JsonObject kermit = JsonObject.of(Map.of(
                "name", JsonString.of("kermit"),
                "age", JsonNumber.of(22),
                "commitmentIssues", JsonBoolean.of(true),
                "wife", JsonNull.instance(),
                "children", JsonArray.of(List.of(
                        JsonString.of("Tiny Tim")
                ))
        ));

        System.out.println(kermit);
    }
}
```

Or by using factory methods on `Json`, which aren't guaranteed to give you any specific subtype but in exchange will handle converting any stray `null`s to `JsonNull`.

```java
import dev.mccue.json.*;

import java.util.List;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        Json kermit = Json.of(Map.of(
                "name", Json.of("kermit"),
                "age", Json.of(22),
                "commitmentIssues", Json.of(true),
                "wife", Json.ofNull(),
                "children", Json.of(List.of(
                        JsonString.of("Tiny Tim")
                ))
        ));

        System.out.println(kermit);
    }
}
```

For `JsonObject` and `JsonArray`, there are also builders available which can make it so that you don't need to write `Json.of` on every value.

```java
import dev.mccue.json.Json;

public class Main {
    public static void main(String[] args) {
        Json kermit = Json.objectBuilder()
                .put("name", "kermit")
                .put("age", 22)
                .putTrue("commitmentIssues")
                .putNull("wife")
                .put("children", Json.arrayBuilder()
                        .add("Tiny Tim"))
                .build();

        System.out.println(kermit);
    }
}
```

### Writing

Once you have some `Json` you can write it out to a `String` using `Json.writeString`.

```java
import dev.mccue.json.Json;

public class Main {
    public static void main(String[] args) {
        Json songJson = Json.objectBuilder()
                .put("title", "Rainbow Connection")
                .put("year", 1979)
                .build();

        String song = Json.writeString(songJson);

        System.out.println(song);
    }
}
```

```json
{"title":"Rainbow Connection","year":1979}
```

If output is meant to be consumed by humans then whitespace can be added using a customized instance of `JsonWriteOptions`.
```java import dev.mccue.json.Json; import dev.mccue.json.JsonWriteOptions; public class Main { public static void main(String[] args) { Json songJson = Json.objectBuilder() .put("title", "Rainbow Connection") .put("year", 1979) .build(); String song = Json.writeString( songJson, new JsonWriteOptions() .withIndentation(4) ); System.out.println(song); } } ``` ```json { "title": "Rainbow Connection", "year": 1979 } ``` If you want to write JSON to something other than a `String`, you need to obtain a `Writer` and use `Json.write`. ```java import dev.mccue.json.Json; import java.io.IOException; import java.nio.file.Files; import java.nio.file.Path; public class Main { public static void main(String[] args) throws IOException { Json songJson = Json.objectBuilder() .put("title", "Rainbow Connection") .put("year", 1979) .build(); try (var fileWriter = Files.newBufferedWriter( Path.of("song.json")) ) { Json.write(songJson, fileWriter); } } } ``` ### Encoding To turn a class you have defined into JSON, you just need to make a method which creates an instance of `Json` from the data stored in your class. ```java import dev.mccue.json.Json; record Muppet(String name) { Json toJson() { return Json.objectBuilder() .put("name", name) .build(); } } public class Main { public static void main(String[] args) { var beaker = new Muppet("beaker"); Json beakerJson = beaker.toJson(); System.out.println(Json.writeString(beakerJson)); } } ``` This process is "encoding." You "encode" your data into JSON and then "write" that JSON to some output. For classes that you did not define, the logic for the conversion just needs to live somewhere. Dealer's choice where, but static methods are generally a good call. 
```java import dev.mccue.json.Json; import java.time.Month; import java.time.MonthDay; import java.time.format.DateTimeFormatter; final class TimeEncoders { private TimeEncoders() {} static Json monthDayToJson(MonthDay monthDay) { return Json.of( DateTimeFormatter.ofPattern("MM-dd") .format(monthDay) ); } } record Muppet(String name, MonthDay birthday) { Json toJson() { return Json.objectBuilder() .put("name", name) .put( "birthday", TimeEncoders.monthDayToJson(birthday) ) .build(); } } public class Main { public static void main(String[] args) { var elmo = new Muppet( "Elmo", MonthDay.of(Month.FEBRUARY, 3) ); Json elmoJson = elmo.toJson(); System.out.println(Json.writeString(elmoJson)); } } ``` ```json {"name":"Elmo","birthday":"02-03"} ``` If a class you define has a JSON representation that could be considered "canonical", the interface `JsonEncodable` can be implemented. This will let you pass an instance of the class directly to `Json.writeString` or `Json.write`. ```java import dev.mccue.json.Json; import dev.mccue.json.JsonEncodable; record Muppet(String name, boolean great) implements JsonEncodable { @Override public Json toJson() { return Json.objectBuilder() .put("name", name) .put("great", great) .build(); } } public class Main { public static void main(String[] args) { var gonzo = new Muppet("Gonzo", true); System.out.println(Json.writeString(gonzo)); } } ``` ### Reading The inverse of writing JSON is reading it. If you have some JSON stored in a `String` you can read it into `Json` using `Json.readString`. ```java import dev.mccue.json.Json; public class Main { public static void main(String[] args) { Json movie = Json.readString(""" { "title": "Treasure Island", "cast": [ { "name": "Kermit", "role": "The Captain", "muppet": true }, { "name": "Tim Curry", "role": "Long John Silver", "muppet": false } ] } """); System.out.println(movie); } } ``` If that JSON is coming from another source, you need to obtain a `Reader` and use `Json.read`. 
```java import dev.mccue.json.Json; import java.io.IOException; import java.io.Reader; import java.nio.file.Files; import java.nio.file.Path; public class Main { public static void main(String[] args) throws IOException { // If you were following along, we created this earlier! Json song; try (Reader fileReader = Files.newBufferedReader( Path.of("song.json")) ) { song = Json.read(fileReader); } System.out.println(song); } } ``` If the JSON you provide is malformed in some way, a `JsonReadException` will be thrown. ```java import dev.mccue.json.Json; public class Main { public static void main(String[] args) { // Should be in quotes Json.readString("fozzie"); } } ``` ```java Exception in thread "main" dev.mccue.json.JsonReadException: JSON error (unexpected character): f at dev.mccue.json.JsonReadException.unexpectedCharacter(JsonReadException.java:33) at dev.mccue.json.internal.JsonReaderMethods.readStream(JsonReaderMethods.java:525) at dev.mccue.json.internal.JsonReaderMethods.read(JsonReaderMethods.java:533) at dev.mccue.json.internal.JsonReaderMethods.readFullyConsume(JsonReaderMethods.java:543) at dev.mccue.json.Json.readString(Json.java:369) at dev.mccue.json.Json.readString(Json.java:364) at dev.mccue.example.Main.main(Main.java:9) ``` ### Decoding Up to this point, everything has been more or less the same as it is for other "tree-based" JSON libraries like [org.json](https://github.com/stleary/JSON-java) or [json-simple](https://github.com/fangyidong/json-simple). This is where that will start to change. To take some `Json` and turn it into a user defined class, a basic approach would be to use `instanceof` checks to see if the `Json` is a particular subtype and navigate from there. 
```java import dev.mccue.json.*; record Muppet(String name, boolean canSpeak) { static Muppet fromJson(Json json) { if (json instanceof JsonObject object && object.get("name") instanceof JsonString name && object.get("canSpeak") instanceof JsonBoolean canSpeak) { return new Muppet(name.toString(), canSpeak.value()); } else { throw new RuntimeException("Invalid Muppet"); } } } public class Main { public static void main(String[] args) { var json = Json.readString(""" { "name": "animal", "canSpeak": false } """); var animal = Muppet.fromJson(json); System.out.println(animal); } } ``` This process is "decoding." You "read" your data into JSON and then "decode" it to some type you define. The problem with the `instanceof` approach is that you will end up with bad error messages on unexpected data. In this case the error message would just be `"Invalid Muppet"`. The code to get better errors is tedious to write and I haven't seen many folks in the wild do it. To get good errors, you should use the static methods defined in `JsonDecoder`. ```java package dev.mccue.example; import dev.mccue.json.*; record Muppet(String name, boolean canSpeak) { static Muppet fromJson(Json json) { return new Muppet( JsonDecoder.field( json, "name", JsonDecoder::string ), JsonDecoder.field( json, "canSpeak", JsonDecoder::boolean_ ) ); } } public class Main { public static void main(String[] args) { var json = Json.readString(""" { "name": "animal", "canSpeak": false } """); var animal = Muppet.fromJson(json); System.out.println(animal); } } ``` These handle the fiddly process of checking whether the JSON matches the structure you expect and throwing an appropriate error. You should read this declaration as "at the field `name` I expect a string." ```java JsonDecoder.field(json, "name", JsonDecoder::string) ``` If the JSON is not an object, or doesn't have a value for `name`, or that value is not a string, you will get a `JsonDecodeException`. 
```java
import dev.mccue.json.*;

public class Main {
    public static void main(String[] args) {
        var json = Json.readString("""
                {
                    "canSpeak": false
                }
                """);

        var animal = JsonDecoder.field(
                json,
                "name",
                JsonDecoder::string
        );

        System.out.println(animal);
    }
}
```

Which will have a message indicating exactly what went wrong and where.

```
Problem with the value at json.name:
{ "canSpeak": false }
no value for field
```

The last argument to `JsonDecoder.field` is the `JsonDecoder` you want to use to interpret the value at that field. In this case a method reference to `JsonDecoder.string`, which is a method that asserts JSON is a string and throws if it isn't.

For the methods which take more than one argument, there are overloads which can be used to get an instance of `JsonDecoder`.

```java
// This will actually decode the json into a list of strings
List<String> items = JsonDecoder.array(json, JsonDecoder::string);

// This will just return a decoder
JsonDecoder<List<String>> decoder =
        JsonDecoder.array(JsonDecoder::string);
```

This, in conjunction with `JsonDecoder.field`, is how you are intended to explore nested paths.

```java
import dev.mccue.json.*;

import java.util.List;

public class Main {
    public static void main(String[] args) {
        var json = Json.readString("""
                {
                    "villains": ["constantine", "doc hopper"]
                }
                """);

        List<String> villains = JsonDecoder.field(
                json,
                "villains",
                JsonDecoder.array(JsonDecoder::string)
        );

        System.out.println(villains);
    }
}
```

To decode JSON into your custom classes, you should add either a constructor or a static factory method which takes in `Json` and use these decoders to make your objects.
```java import dev.mccue.json.*; import java.util.List; record Actor(String name, String role, boolean muppet) { static Actor fromJson(Json json) { return new Actor( JsonDecoder.field(json, "name", JsonDecoder::string), JsonDecoder.field(json, "role", JsonDecoder::string), JsonDecoder.optionalField( json, "muppet", JsonDecoder::boolean_, true ) ); } } record Movie(String title, List<Actor> cast) { static Movie fromJson(Json json) { return new Movie( JsonDecoder.field(json, "title", JsonDecoder::string), JsonDecoder.field( json, "cast", JsonDecoder.array(Actor::fromJson) ) ); } } public class Main { public static void main(String[] args) { var json = Json.readString(""" { "title": "Treasure Island", "cast": [ { "name": "Kermit", "role": "The Captain" }, { "name": "Tim Curry", "role": "Long John Silver", "muppet": false } ] } """); var movie = Movie.fromJson(json); System.out.println(movie); } } ``` ### Full Round-Trip With all of that out of the way, here is how you might define a model, write it to json, and read it back in. 
```java import dev.mccue.json.*; import java.util.List; record Actor(String name, String role, boolean muppet) implements JsonEncodable { static Actor fromJson(Json json) { return new Actor( JsonDecoder.field(json, "name", JsonDecoder::string), JsonDecoder.field(json, "role", JsonDecoder::string), JsonDecoder.optionalField( json, "muppet", JsonDecoder::boolean_, true) ); } @Override public Json toJson() { return Json.objectBuilder() .put("name", name) .put("role", role) .put("muppet", muppet) .build(); } } record Movie(String title, List<Actor> cast) implements JsonEncodable { static Movie fromJson(Json json) { return new Movie( JsonDecoder.field(json, "title", JsonDecoder::string), JsonDecoder.field( json, "cast", JsonDecoder.array(Actor::fromJson) ) ); } @Override public Json toJson() { return Json.objectBuilder() .put("title", title) .put("cast", cast) .build(); } } public class Main { public static void main(String[] args) { var json = Json.readString(""" { "title": "Treasure Island", "cast": [ { "name": "Kermit", "role": "The Captain", "muppet": true }, { "name": "Tim Curry", "role": "Long John Silver", "muppet": false } ] } """); var movie = Movie.fromJson(json); var roundTrippedJson = Json.readString( Json.writeString(movie.toJson()) ); var roundTrippedMovie = Movie.fromJson(roundTrippedJson); System.out.println( json.equals(roundTrippedJson) ); System.out.println( movie.equals(roundTrippedMovie) ); } } ``` ## Pitches My hope is that at this point you have a sense of how it might look to use this library for your projects. The rest of the post will just be some pitches to try to push you into the dark side. ### It is not magic Some people are perfectly fine with jackson-databind, gson, and other frameworks which use a class as a schema to read in JSON. Others seem not to be. Kvetching about annotations and frameworks that make use of annotations is a common past-time in the community. The current options for decoding without databind kinda suck though. 
To highlight this - I was talking with someone who takes the "magic bad" position. They said that generally they just use gson and manually construct their objects. I challenged them to interpret this JSON into classes using their usual method. ```json { "title": "Treasure Island", "cast": [ { "name": "kermit" }, { "name": "gonzo" }, { "name": "rizzo" } ] } ``` And the following is the code they came up with. ```java package example.gson; import com.google.gson.JsonObject; public record Muppet(String name) { public static Muppet createFrom(JsonObject muppetObject) { String name = muppetObject .get("name") .getAsString(); return new Muppet(name); } } ``` ```java package example.gson; import java.util.ArrayList; import java.util.List; import com.google.gson.JsonArray; import com.google.gson.JsonObject; public record Movie(String title, List<Muppet> cast) { public static Movie createFrom(JsonObject object) { String muppetTitle = object .get("title") .getAsString(); List<Muppet> cast = new ArrayList<>(); JsonArray castArray = object.getAsJsonArray("cast"); for (int i = 0; i < castArray.size(); i++) { JsonObject muppetObject = castArray .get(i) .getAsJsonObject(); cast.add(Muppet.createFrom(muppetObject)); } return new Movie(muppetTitle, cast); } } ``` ```java package example.gson; import com.google.gson.JsonObject; import com.google.gson.JsonParser; public class Example { public static void main(String[] args) { String content = "{\r\n" + " \"title\": \"Treasure Island\",\r\n" + " \"cast\": [\r\n" + " {\r\n" + " \"name\": \"kermit\"\r\n" + " },\r\n" + " {\r\n" + " \"name\": \"gonzo\"\r\n" + " },\r\n" + " {\r\n" + " \"name\": \"rizzo\"\r\n" + " }\r\n" + " ]\r\n" + "}"; JsonObject json = JsonParser .parseString(content) .getAsJsonObject(); Movie movie = Movie.createFrom(json); System.out.println(movie); } } ``` I think this code is pretty representative of the variety one would produce when working against this sort of API. 
The follow-up challenge I gave him was to run this code against some malformed input. ```json { "title": "Treasure Island", "cast": [ { }, { "name": "gonzo" }, { "name": "rizzo" } ] } ``` The error message that his code produced was the following. ``` Cannot invoke "com.google.gson.JsonElement.getAsString()" because the return value of "com.google.gson.JsonObject.get(String)" is null ``` Which, while better than it would have been in years past (thanks [JEP 358](https://openjdk.org/jeps/358)), still isn't amazing. Compare that to the error message someone will get with what is the most natural way to express this with my library. ```java record Muppet(String name) { static Muppet fromJson(Json json) { return new Muppet( JsonDecoder.field(json, "name", JsonDecoder::string) ); } } record Movie(String title, List<Muppet> cast) { static Movie fromJson(Json json) { return new Movie( JsonDecoder.field( json, "title", JsonDecoder::string ), JsonDecoder.field( json, "cast", JsonDecoder.array(Muppet::fromJson) ) ); } } ``` ``` Problem with the value at json.cast[0].name {} no value for field ``` The code they produced is also pretty heavily "imperative." To make their list of Muppets they have a plain for loop and transform every element individually. ```java List<Muppet> cast = new ArrayList<>(); JsonArray castArray = object.getAsJsonArray("cast"); for (int i = 0; i < castArray.size(); i++) { JsonObject muppetObject = castArray .get(i) .getAsJsonObject(); cast.add(Muppet.createFrom(muppetObject)); } ``` This is not intrinsically bad by any means, for loops are not evil, but all code lives somewhere on a spectrum from "declarative" to "imperative". Describing what should be done versus how it should be done. If you compare their code to what one would write when relying on gson's reflection mechanisms the difference is stark. 
```java record Muppet(String name) {} record Movie(String title, List<Muppet> cast) {} ``` Yes, you need to know the rules for how JSON is automatically mapped to these structures and what different annotations mean if they are present. But this is unquestionably a "declarative schema." If you know the rules it is easier to read. The code you would write with my library occupies a middle ground. ```java record Muppet(String name) { static Muppet fromJson(Json json) { return new Muppet( JsonDecoder.field(json, "name", JsonDecoder::string) ); } } record Movie(String title, List<Muppet> cast) { static Movie fromJson(Json json) { return new Movie( JsonDecoder.field( json, "title", JsonDecoder::string ), JsonDecoder.field( json, "cast", JsonDecoder.array(Muppet::fromJson) ) ); } } ``` While there is more of it than when you rely on heuristics, it is still "reasonably declarative." ```java return new Movie( JsonDecoder.field( json, "title", JsonDecoder::string ), JsonDecoder.field( json, "cast", JsonDecoder.array(Muppet::fromJson) ) ); ``` The logic for it is both extensible (there is nothing privileged about `JsonDecoder`; you can write your own helper methods) and defined in code that you can click-to-definition to. If a field can be null you would see `nullableField`. If a field can be missing, you would see `optionalField`. If it could be both, you would see `optionalNullableField`. ### This is simple to teach I don't know about you, but I am absolutely sick of explaining Jackson to students who are still struggling with classes in general. If a student has JSON like this ```json [ {"name": "kailee"}, {"name": "fran"} ] ``` Then to read it in as a `List<Person>` they need to either * target a `Person[]` * target a class which extends `ArrayList<Person>` * provide a `TypeToken<List<Person>>` Plus maybe some other options I might be unaware of. 
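The reason a bare `List<Person>` is not enough can be shown without any JSON library. The following is a minimal sketch using only the JDK. `TypeRef` is a hypothetical stand-in for the idea behind a type token, not gson's actual `TypeToken` implementation, and `ErasureDemo` exists only for this illustration.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

record Person(String name) {}

// Hypothetical type-token class: an anonymous subclass records its
// generic superclass, which is one of the few places the element
// type survives erasure.
abstract class TypeRef<T> {
    final Type type;

    protected TypeRef() {
        ParameterizedType superType =
            (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }
}

final class ErasureDemo {
    private ErasureDemo() {}

    // Both lists share one runtime class, so a reflective mapper
    // handed "a List" cannot tell what the elements should be.
    static boolean sameRuntimeClass() {
        List<Person> people = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return people.getClass() == strings.getClass();
    }

    // The anonymous subclass (note the trailing {}) keeps the full
    // List<Person> type available to reflection.
    static String capturedType() {
        return new TypeRef<List<Person>>() {}.type.getTypeName();
    }
}
```

`ErasureDemo.sameRuntimeClass()` returns `true`, which is the erasure part. `ErasureDemo.capturedType()` returns a name that still mentions both `List` and `Person`, which is the inheritance-plus-reflection part.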
To actually understand what they are doing for just that, they need to have a sense for some combination of * Inheritance * Generic Erasure * Reflection And that is really hard to impart at their stage, so often us online helpers just say "ah, make a class that looks like this." and send them on their way. On the flip-side, when beginners get frustrated with a databind approach and fall back to something like [org.json](https://github.com/stleary/JSON-java) they seem to produce some absolute monstrosities before they come back with another question. It's not their fault, they learned loops at most a semester ago, but it does present some practical problems. The tension is between giving an option that there is enough time to teach the mechanics of and teaching an approach that will be ergonomic enough for them to complete their assignments. I've been testing early drafts of this library against real students and, while there are too many confounding variables to say I've done any real science, I've found it to be far easier. When a student needs a quick monkey-see-monkey-do, the `JsonDecoder.field` pattern seems to work just fine. When a student wants, needs, or has time for a full explanation there is a far shorter distance between where they are and where I need to get them. I just need to make sure they understand interfaces and lambdas then they are ready for some version of the tutorial I gave in the first section. Students in college aren't the only people who need to be taught how to work with JSON in Java either. If you work for a company that hires juniors or folks who come from different language backgrounds, then there has to be an education step. It might be worth the boilerplate of writing out `fromJson` and `toJson` to have a codebase where onboarding doesn't need to touch the "advanced" side of Java to send JSON over the wire. 
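To make "interfaces and lambdas" concrete, here is a stripped-down sketch of the decoder idea that uses plain `Map` and `List` values in place of a real JSON tree. Everything in it (`Decoder`, `Decoders`, `Student`) is invented for illustration and is not the library's actual API. It only aims to show that the pattern needs nothing beyond a functional interface and method references.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical decoder: a function from an untyped value to a T.
@FunctionalInterface
interface Decoder<T> {
    T decode(Object value);
}

final class Decoders {
    private Decoders() {}

    // Decodes a value that must be a String.
    static String string(Object value) {
        if (value instanceof String s) {
            return s;
        }
        throw new IllegalArgumentException("expected a string, got: " + value);
    }

    // Looks up a required field on an object and decodes it.
    static <T> T field(Object value, String name, Decoder<T> decoder) {
        if (!(value instanceof Map<?, ?> map)) {
            throw new IllegalArgumentException("expected an object, got: " + value);
        }
        if (!map.containsKey(name)) {
            throw new IllegalArgumentException("no value for field: " + name);
        }
        return decoder.decode(map.get(name));
    }

    // Builds a list decoder out of an element decoder.
    static <T> Decoder<List<T>> array(Decoder<T> decoder) {
        return value -> {
            if (!(value instanceof List<?> items)) {
                throw new IllegalArgumentException("expected an array, got: " + value);
            }
            List<T> result = new ArrayList<>();
            for (Object item : items) {
                result.add(decoder.decode(item));
            }
            return List.copyOf(result);
        };
    }
}

record Student(String name) {
    static Student fromJson(Object json) {
        return new Student(Decoders.field(json, "name", Decoders::string));
    }
}
```

With this in place, `Decoders.array(Student::fromJson).decode(List.of(Map.of("name", "kailee"), Map.of("name", "fran")))` yields a list of two `Student` records, and an object missing `name` fails with a message naming the field. No reflection or generics trickery is involved, only a method that takes another method.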
### It could help Java get an official JSON library There has been a JEP open for years which [proposes adding JSON to the standard library](https://openjdk.org/jeps/198). If you use a build tool like Maven for all of your programs, it might not seem important. You can pull in Jackson, gson, org.json, this, or any library with one declaration. There are a few things which make me care about it though. 1. Right now, the only data format built into Java is XML. That's not exactly the king of language neutral formats it was in the 90s. 2. Java now supports [single file programs](https://openjdk.org/jeps/330) and will eventually support ["terse" main methods](https://openjdk.org/jeps/8302326). As the applicability for scripting goes up, the lack of the ability to use JSON hurts more since that is generally a "no-dependency" situation. 3. Whatever is in the standard library has the power to affect defaults. Databind as the default for the ecosystem feels like it has too much momentum to change otherwise. 4. When [integrating Graal](https://www.graalvm.org/2022/openjdk-announcement/) there are going to be components, like the `reflection-config.json` file it uses for [native image](https://www.graalvm.org/22.0/reference-manual/native-image/), that will read and write JSON. I fear that will be too tempting a target for `--add-opens`. Regardless of whether you agree that support should be in the standard library, I think the previous section illustrates some of the problems that could come if one of the existing APIs were adopted. ```java List<Muppet> cast = new ArrayList<>(); JsonArray castArray = object.getAsJsonArray("cast"); for (int i = 0; i < castArray.size(); i++) { JsonObject muppetObject = castArray .get(i) .getAsJsonObject(); cast.add(Muppet.createFrom(muppetObject)); } ``` Regardless of applicability, availability could end up making this the default. That is not ideal. An option to avoid this is for the standard library to add direct support for databind. 
Not only was that ruled out by the existing JEP, it would probably just be a bad idea. Mapping the JSON data model to the wide universe of Java objects has solutions that occupy a very large design space. Considering the long term commitments the JDK makes whenever it adds a new feature as well as the mental, physical, and emotional damage dealt by its existing `Serializable` mechanism - I don't see that happening. If the JDK gave up and just provided a low-level streaming parser akin to [`jackson-core`](https://github.com/FasterXML/jackson-core), then it wouldn't affect the defaults in the ecosystem that much, but it would raise the question of "why not just use [`jackson-core`](https://github.com/FasterXML/jackson-core)." In addition, users would still have to add a library to accomplish most tasks. There wouldn't be much of a benefit. So that's where this library comes in. It's nowhere near seaworthy for that ocean, but the `JsonDecoder` approach is relatively novel in the JVM ecosystem. The mechanisms needed to make it "work" have only been around since Java 8 and, as far as I know, haven't been tested on any large scale. The more folks try it, or write libraries that do the concept "but better", or socialize it, etc. the more confidence there can be in whether a decoder based API would be applicable to the needs of the JDK. You can see my recent conversation on the mailing list about this [here](https://mail.openjdk.org/pipermail/core-libs-dev/2022-December/097949.html). ### You can play with new features Maybe I haven't convinced you to give it a try for your work or personal projects. That's fine. Still, it is a small codebase. It makes (I think) good use of the features added to Java in the last decade. If you aren't caught up it might be a good reference point to do so. 
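As a rough illustration of the kind of features meant here, the sketch below models a tiny JSON-like tree with a sealed interface, records, and pattern matching for `instanceof`. The type names (`JsonValue`, `JsonText`, `JsonNumber`, `JsonList`, `JsonPrinter`) are made up for this example and are not claimed to match the library's internals. String escaping is deliberately left out.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model: the sealed interface lists every allowed case.
sealed interface JsonValue permits JsonText, JsonNumber, JsonList {}

record JsonText(String value) implements JsonValue {}

record JsonNumber(double value) implements JsonValue {}

record JsonList(List<JsonValue> items) implements JsonValue {}

final class JsonPrinter {
    private JsonPrinter() {}

    // Writes the tree as text; no escaping is performed in this sketch.
    static String write(JsonValue json) {
        if (json instanceof JsonText text) {
            return "\"" + text.value() + "\"";
        } else if (json instanceof JsonNumber number) {
            return Double.toString(number.value());
        } else if (json instanceof JsonList list) {
            return list.items().stream()
                .map(JsonPrinter::write)
                .collect(Collectors.joining(",", "[", "]"));
        }
        // The interface is sealed, so no other case can exist.
        throw new IllegalStateException("unknown JsonValue: " + json);
    }
}
```

Calling `JsonPrinter.write(new JsonList(List.of(new JsonText("kermit"), new JsonNumber(1.0))))` produces `["kermit",1.0]`. On newer Java versions the `if` chain could become a pattern-matching `switch`, which is the sort of experiment a small codebase makes cheap.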
If you want to play with upcoming features like [value classes](https://openjdk.org/jeps/8277163) or [string templates](https://openjdk.org/jeps/430) it could be a nice playground to see how that would affect performance, design, or just how the code feels. In particular, JSON is mentioned as a use-case in the [JEPs](https://openjdk.org/jeps/430) and [explanations for new features](https://www.infoq.com/articles/data-oriented-programming-java/) often. Could be nice to have a JSON api to point to that actually works in the way being described. Sun, 26 Feb 0023 05:00:00 +0000Development Perils: How to not create a mobile applicationhttps://mccue.dev/pages/12-28-22-development-perils-how-to-not-create-a-mobile-application Ever since Forus Labs' first mobile application, [TimeBloc](https://timebloc.foruslabs.com/), was acquired in September 2020, I've mused about writing a short postmortem on its less-than-stellar development. Perhaps as a conclusion to the first chapter in our software engineering careers. I hesitated each time, unsure of how to concisely fit everything into an article. It's almost 2023. Enough time has passed that memories of that period are becoming hazy. I can't hesitate any longer. <figure> <img src="/pages/12-28-22-timebloc-development.jpeg" alt=""/> <figcaption>Rare photograph of TimeBloc's development circa 2019</figcaption> </figure> Those seeking groundbreaking insights into software engineering should stop reading. This article just describes the aftermath of ignoring practices beaten to death by others. So gather around the fireplace, as I tell a tale of poor software engineering decisions; of how to _not_ create a mobile application. ## Choose Wisely Our tale begins in early 2019. Three lads, fresh out of polytechnic (high-school equivalent), had an overabundance of time before embarking on their [compulsory service](https://en.wikipedia.org/wiki/National_service_in_Singapore). 
They assumed creating a mobile application to be an entertaining and straightforward affair. However, none of them had any prior professional experience creating mobile applications. You can probably sense where this is heading. During meetings at their local Starbucks, they pitched wild and fantastical features to include in their time-blocking application. One of those features was real-time synchronization of all the user's data, i.e. time-blocks and settings, across their devices. Debates on whether even _that_ was too fantastical continued perpetually until a compromise was sought, deferring the feature to a subsequent release. Unknowingly, the three lads had steered the project away from certain doom. Deferring the feature was one of the few mistakes _avoided_. It was only discovered to be fraught with difficulty after implementing a similar feature in a subsequent application. <figure> <img src="/pages/12-28-22-multileader-replication.png" alt=""/> </figure> In essence, it was a distributed computing problem. Users could concurrently modify and sync data across several of their devices. To further complicate matters, the application had to be offline-first. That is to say, the application must work even when unconnected to the internet. Modifications had to be reconciled and propagated as they arrived piecemeal. Think _"Multi-Leader Replication on Steroids"_. Had the three lads stubbornly insisted on real-time synchronization of data, TimeBloc would have remained vaporware to this day. A project's features are quite literally make-or-break. The moral of the story is, _choose wisely what to implement, err on the side of caution and do not implement something if in doubt_, [KISS](https://en.wikipedia.org/wiki/KISS_principle). Likewise, don't implement offline-first support and data synchronization together. It's difficult. ## Ecological Survey A few weeks passed in the blink of an eye. The three lads had finished bike-shedding the application's initial features. 
Said features remained tame, devoid of those deemed too outlandish. Before development commenced, one question still remained. Which language and framework should they use? <figure> <img src="/pages/12-28-22-seaport-at-sunset.jpeg" alt=""/> <figcaption>"Sea port at sunset" - Claude Lorrain, 1639</figcaption> </figure> The three lads found themselves at a port seeking passage across a perilous, sprawling ocean. Once aboard, it was nigh impossible to switch ships mid-voyage. Moored close to shore were two colossal ships, native Android and iOS development, surrounded by flocks of passengers awaiting embarkment. Both ships were remarkably popular, their seaworthiness tried-and-tested by time. Moored further down the pier was React Native. Despite having been built later, it had proven to be seaworthy and attracted a respectable crowd. Lastly, there was Flutter, a brand-new ship yet to sail its maiden voyage. It incorporated the latest advancements in shipbuilding and was surrounded by crowds on the dock. Nevertheless, few in those crowds were actual passengers. Lacking the manpower and funding to develop two separate applications, native Android and iOS development were out of the picture. Both had a single, different destination. However, our funds afforded us passage to only one. Yet, we sought to visit both destinations. Thus, the only contenders were cross-platform frameworks like Flutter and React Native. After brief experimentation and poring over documentation, Flutter was chosen. Unbeknown to us was the importance of conducting a thorough ecological survey. That is to say, smitten by the ship's advanced exterior, we forgot to check if the ship's interior was even furnished. Flutter in 2019 isn't Flutter in 2022. It was still in its infancy. Likewise, the community and open-source ecosystem surrounding the framework were still budding. It was only discovered partway through development that there was no support for [Lottie](https://airbnb.io/lottie/#/) animations. 
<figure> <img src="/pages/12-28-22-lottie-animation.gif" alt=""/> <figcaption>Sample Lottie animation</figcaption> </figure> Although [Rive](https://rive.app/) was supported, good luck convincing any freelance designer to create an animation in that format. Stuck between a rock and a hard place, the difficult decision was eventually made to scrap all animations. Some other memorable issues included the notification scheduling library not accounting for daylight saving time, and the SQLite Flutter library not supporting desktop environments. The latter meant unit tests depending on SQLite couldn't be run outside an Android/iOS emulator. It greatly influenced the decision to skip unit tests covered in the next section. Because of its recency, Flutter's community had yet to take root. This manifested as less publicly available information owing to the lack of grey-haired Gandalf-types that thrived on other platforms. Consequently, that led to greater difficulty with debugging and troubleshooting problems. One particularly nasty incident occurred after integrating background notification scheduling. In production, reports that the application crashed during start-up began coming in. Further examination revealed that it only affected iPhone 8 devices running a certain iOS version. To complicate matters, the issue could not be replicated on an emulator nor did we own an iPhone 8 running that iOS version. An entire weekend was spent frantically debugging the issue, scouring the internet for any hints to no avail. Desperate, the decision was made to remove background notification scheduling altogether in an emergency patch. Developing any non-trivial piece of software will inevitably require features offered beyond the language or framework. It is often the surrounding community that provides those missing pieces. Reusing the ship analogy, embarking on a ship guarantees passage but not comfort. 
The moral of the story is, _always conduct an ecological survey on the surrounding open-source ecosystem and community when deciding on a language/framework_. ## Test Now In yet another blink of an eye, a few months had passed. Our three lads found themselves wading knee-deep in development work. Things had progressed slower than anticipated while the looming deadline drew close. The metaphorical ship's pace had to be quickened. To lighten the ship, the three lads tossed the lifeboats overboard. They reasoned that the ship wasn't on fire, and the lifeboats could be retrieved if it caught fire, or at the end of the voyage. Long story short, they didn't. The lifeboats remain lost at sea to this day. Skipping unit testing was controversial. Although we acknowledged it to be potentially disastrous, the motivations seemed rational. Unit tests benefited maintenance in the long term. However, there wasn't going to be a long term if the application missed the initial deadline. Tests could always be added once things had stabilized. In the interim, manual testing should suffice. _It couldn't be that bad_. In short, test later gradually became test never. Things _could_ be that bad. Manual testing was time-consuming and unreliable in a constant development flux. That meant manual tests gradually subsided too, while developer confidence plummeted. Eventually, manual testing was only conducted when gluing the UI and business logic together. The application was built using a pseudo-BLoC architecture composed of several layers. Each developer tackled a single layer in isolation. Contrary to the adage of "_integrating often and early_", integration only commenced once all layers were individually completed. It was neither often nor early. Skipping tests and delaying integration was a potent combination. It halted progress and development descended into the nine circles of hell. 
It was only discovered during integration that each layer behaved contrary to the other developers' expectations. Similar to the Tower of Babel, further examination revealed contrasting interpretations of each layer's supposed behaviour. To remedy the issue, several bootleg modifications were applied over the span of a day, further damaging the application's structure. <figure> <img src="/pages/12-28-22-debugging-timebloc.gif" alt=""/> <figcaption>Rare photo of developer debugging TimeBloc, circa 2019</figcaption> </figure> To worsen matters, every imaginable bug surfaced in swarms during manual testing. The application would spontaneously crash and data would become corrupted seemingly at random. Since each individual layer wasn't tested, identifying and isolating the root causes became miniature D&D campaigns. A bug could be caused by the UI, the persistence layer, or anything in between. Speaking from personal experience, nothing is as soul-draining as reaching work at 10am and debugging until 4am. In the end, although the application barely met the looming deadline, the decision to forgo unit testing turned the application into a "[Haunted Graveyard](https://www.usenix.org/sites/default/files/conference/protected-files/srecon17americas_slides_reese.pdf)" during its lifetime. Future development stalled. Features couldn't be added and existing bugs couldn't be fully stamped out. Because of that, rewriting the application was under consideration shortly before the application was acquired. We failed to acknowledge the immediate maintenance benefits of unit tests. The time spent performing manual testing surpassed the predicted time writing equivalent unit tests several times over, and that is before counting the time spent debugging or the toll on developers' morale. Similarly, integrating changes late increased the cost of debugging and modification substantially. 
This combination forced us to cut features and postpone our plans to implement monetization in the initial release. Shedding tests to quicken velocity is almost always counterproductive. _Test now before it becomes test never_. That goes both for writing unit tests and integrating early. See [Chapter 11 of Software Engineering at Google](https://abseil.io/resources/swe-book/html/ch11.html) for a more in-depth treatment of the subject. ## Perils Following the previous sections, leftover material still remains. None of it is substantial enough to warrant an entire section. Listed below, in no particular order, are perils encountered during development. * Bundled SQLite versions may be ancient. Ensure that all SQLite features used are supported on all target platforms. * Foreign keys aren't enabled by default in SQLite. Always enable foreign keys via `PRAGMA foreign_keys = ON`. * Offline-first & data synchronization aren't a simple combination. Be prepared for distributed computing problems. * It is trivial to mangle time zones. Be careful when using Dart's lackluster DateTime class. See [Falsehoods programmers believe about time zones](https://www.zainrizvi.io/blog/falsehoods-programmers-believe-about-time-zones/). * Don't publish new versions before the weekends/holidays. You might wind up spending that time debugging issues in production. * Document everything. Trying to understand undocumented spaghetti code you wrote 1 year ago is difficult. ## Final Thoughts Our first foray into the world of professional software engineering wasn't glamorous. Nevertheless, it still represented a significant step forward. Although plenty of lessons were learnt through blood, toil, tears and sweat, I'm glad to be able to sit here and laugh at our own foolish mistakes in hindsight. Likewise, I hope you had a chuckle at the sheer madness even if you didn't take away anything else. 
## TL;DR * _Choose wisely what to implement, err on the side of caution and do not implement something if in doubt._ * _Always conduct an ecological survey on the surrounding open source space & community when deciding on a language/framework._ * _Test now before it becomes test never._ --- Article was originally published on [Medium](https://matthiasngeo.medium.com/development-perils-how-to-not-create-a-mobile-application-bda101009438).Wed, 28 Dec 0022 05:00:00 +0000How to Structure a Clojure Web App 101https://mccue.dev/pages/12-7-22-clojure-web-primer <meta property="og:image" content="https://mccue.dev/pages/12-7-22-bobs.png" /> <meta name="twitter:image" content="https://mccue.dev/pages/12-7-22-bobs.png"> At work, we use [integrant](https://github.com/weavejester/integrant) to manage stateful components in our Clojure apps. It has been fine, but it's a constant struggle to explain it. From a purely mechanical perspective there is a lot to teach. It uses multimethods to register lifecycle hooks, idiomatic use demands namespaced keywords, and in testing we've needed to incorporate [special libraries](https://github.com/RickMoynihan/redef-methods). None of that is fundamentally a problem though. All the libraries which do this sort of thing use _some_ weirder part of Clojure's arsenal. For [component](https://github.com/stuartsierra/component) it is records and protocols. For [clip](https://github.com/juxt/clip) it is namespaced symbols and dynamic lookup. For [donut](https://github.com/donut-party/system) it's a secret, more complex third thing. What has been a challenge is explaining what exactly it is that these libraries _do_. Doing that - really doing that - requires a mountain of shared context that folks simply do not have. <a href="https://www.youtube.com/watch?v=m4OvQIGDg4I"><img src="/pages/12-7-22-bobs.png" alt="What would you say you... do here?"/></a> This article is an attempt to convey some of that shared context. Apologies if it gets a bit ranty. 
## Ring This is an HTTP Request. ``` GET /echo HTTP/1.1 Host: mccue.dev Accept: text/html ``` This is an HTTP Response. ```http HTTP/1.1 200 OK Content-Type: application/json Content-Length: 19 {"success":"true"} ``` Basically the entire Clojure world has agreed to a [specification called "ring"](https://github.com/ring-clojure/ring/blob/master/SPEC) which says how these requests and responses translate to data structures in Clojure. Clojure web servers are functions that take "ring requests" which look like the following ```clojure {:uri "/echo" :request-method :get :headers {} :body ... :protocol "HTTP/1.1" :remote-addr "127.0.0.1" :server-port 80 :content-length nil :query-string nil :scheme :http} ``` and produce "ring responses" which look like this. ```clojure {:status 200 :headers {"Content-Type" "application/json"} :body "{\"success\":\"true\"}"} ``` Everything else - [routing](https://github.com/metosin/reitit), [authentication](https://github.com/funcool/buddy-auth), [middleware](https://github.com/ring-clojure/ring-json) - is built upon this foundation. ```clojure (ns example (:require [ring.adapter.jetty :as jetty])) (defn handler [request] (cond (= (:uri request) "/hello") {:status 200 :body "Hello, World"} :else {:status 404})) (defn start-server [] (jetty/run-jetty handler {:port 1234})) ``` So this code, as written, will run a [Jetty](https://www.eclipse.org/jetty/) server which responds to all requests to `/hello` with `Hello, World` and all other requests with a `404`. ## The REPL One issue that is already relevant with the preceding example, and will be a common theme going forward, is "REPL Friendliness." Clojure and [other Lisps](https://www.youtube.com/watch?v=Y0LUZ7gbWbk) have the unique property that the "unit" of code isn't a file, but instead an individual "form." As an example, with Python you cannot run the following code. ```python print("Start") 3di92d93209032 ``` You will get a syntax error on the third line and nothing will run. 
``` File "/Users/emccue/Development/posts/example.py", line 3 3di92d93209032 ^ SyntaxError: invalid decimal literal ``` The equivalent Clojure looks like this. ```clojure (println "Start") 903f903jf939cn34f934fj9j39f4 ``` Unlike with the Python example, the very first `println` will actually run before a crash. ``` Start Syntax error reading source at (example.clj:4:0). Invalid number: 903f903jf939cn34f934fj9j39f4 ``` The reason for this is that the Clojure reader will evaluate each "form" one at a time. There is no full pass of the file before running code. This enables a workflow where a developer has a file open in one window with the full contents of their code and another window open at the same time with their "live" program - the "REPL". Through editor magic, a developer can then load new code one form at a time into the live program. If in doing so a function is redefined, then the new definition of the function will start to be used. There are many other explanations for [this mechanism](https://clojure.org/guides/repl/introduction) and [the workflow it enables](https://clojure.org/guides/repl/enhancing_your_repl_workflow) online. So with that context, what is "not REPL friendly" about the example server code? ```clojure (defn handler [request] (cond (= (:uri request) "/hello") {:status 200 :body "Hello, World"} :else {:status 404})) ``` Assuming that first we load the `handler` function, we will next load the `start-server` function. ```clojure (defn start-server [] (jetty/run-jetty handler {:port 1234 :join? false})) ``` And some code will eventually call it to start the server ```clojure (start-server) ``` At this point, a developer might want to modify the `handler` function to respond to requests on the `/marco` route. 
```clojure (defn handler [request] (cond (= (:uri request) "/hello") {:status 200 :body "Hello, World"} (= (:uri request) "/marco") {:status 200 :body "POLO!"} :else {:status 404})) ``` If they did this and tried making a request to `/marco`, the server would still respond with a `404`. The reason for this is that whenever `start-server` is called it will be passed the current "value" backing the `handler` function. Future updates won't be picked up unless the server is stopped and restarted. This is pretty trivially side-steppable by using some "indirection" mechanisms. ```clojure (defn start-server [] (jetty/run-jetty #'handler {:port 1234 :join? false})) ``` In this case, putting the `#'` in front of `handler` makes it so that whenever it is called the [current value of the `handler` function will be used](https://clojure.org/reference/vars). If a developer were to re-load a new definition of `handler` into the REPL it would be immediately picked up and used. This is what REPL friendly code looks like. It makes it easier for a developer to have changes picked up on the fly in a running program and rapidly experiment with new things. There are other associated techniques like leaving a comment at the bottom of a file with code only intended to be used with the REPL. ```clojure (ns example (:require [ring.adapter.jetty :as jetty])) (defn handler [request] ...) (defn start-server [] ...) ;; The Server will not start automatically, but a dev ;; can conveniently start it by putting their cursor in ;; the comment and loading the call into the repl (comment (start-server)) ``` ## Global Stateful Resources Of course, most web apps are not written entirely in a single function. The most natural point at which to split out logic tends to be at handlers for different paths. 
```clojure (ns example (:require [ring.adapter.jetty :as jetty])) (defn hello-handler [request] {:status 200 :body "Hello, World"}) (defn marco-handler [request] {:status 200 :body "POLO!"}) (defn handler [request] (cond (= (:uri request) "/hello") (hello-handler request) (= (:uri request) "/marco") (marco-handler request) :else {:status 404})) (defn start-server [] (jetty/run-jetty #'handler {:port 1234 :join? false})) (comment (start-server)) ``` And of course the actual declarations of the routes can be separated from the code that starts the server, but that would get hard to follow here. At this point most of the code is fairly easy to test. You just make fake requests, pass them to the handlers, and check that the responses are what you expect. ```clojure (ns example-test (:require [clojure.test :as t] [example])) (t/deftest handler-test (t/testing "Request to /hello gets Hello, World" (let [response (example/handler {:uri "/hello"})] (t/is (= (:status response) 200)) (t/is (= (:body response) "Hello, World")))) (t/testing "Request to /marco gets POLO!" (let [response (example/handler {:uri "/marco"})] (t/is (= (:status response) 200)) (t/is (= (:body response) "POLO!")))) (t/testing "Request to unknown path gets 404" (let [response (example/handler {:uri "/jdkdawdoaddwadad"})] (t/is (= (:status response) 404))))) ``` This is a cool property of the overall ring model. [You can directly test handlers without having to actually spin up a server](https://github.com/ring-clojure/ring-mock). No [real program](https://www.haskell.org/) can stay a collection of easy-to-test pure functions forever, though. Handling a request often implies the need for dependence on some "stateful resources" such as external services and connection pools. 
### External Services

As an example, let's say when you make a request to `/marco` we still want to respond with `POLO!`, but if the user specifies that they are not in a pool with a query string `/marco?nopool` then we want to respond with the entire Wikipedia page for Marco Polo.

```clojure
(defn marco-handler
  [request]
  (if (= (:query-string request) "nopool")
    {:status 200
     :body   (slurp "https://en.wikipedia.org/wiki/Marco_Polo")}
    {:status 200
     :body   "POLO!"}))
```

While we can still test this conveniently, the test will have an implicit dependence on Wikipedia being online. It also makes our tests slower than they need to be since we are making an actual HTTP call.

```clojure
(ns example-test
  (:require [clojure.string :as string]
            [clojure.test :as t]
            [example]))

(t/deftest marco-handler-test
  (t/testing "Request to /marco gets POLO!"
    (let [response (example/marco-handler {:uri "/marco"})]
      (t/is (= (:status response) 200))
      (t/is (= (:body response) "POLO!"))))
  (t/testing "Request to /marco with no pool gets info"
    (let [response (example/marco-handler {:uri "/marco" :query-string "nopool"})]
      (t/is (= (:status response) 200))
      (t/is (string/includes? (:body response) "The Travels of Marco Polo")))))
```

This isn't ideal, but it could be worse. Imagine if you wanted to alert an admin every time the `/hello` route was called. A bit of a silly example, but calls to APIs like [Sendgrid](https://sendgrid.com/) aren't unreasonable to do in response to some requests.

```clojure
(defn hello-handler
  [request]
  (sendgrid/send-email "admin@website.com" "You got a user!")
  {:status 200
   :body   "Hello, World"})
```

As written, this is a doozy to test. Either you

* Make sure you only have test credentials loaded when running your unit tests. Don't mess it up!
* Make sure you have no credentials loaded when running your unit tests. You now need to be extra cautious that you are okay with calls being made that will always fail.
* Stub out the functions that call out to external services with a mechanism like `with-redefs`. The problem with the last solution, even though it does mechanically solve the issue, is that you need to know what external services a piece of code will use. Since our handlers are just taking a request, there is not enough information at call-sites or in the function header to say for sure. ```clojure (defn hello-handler [request] ;; Have to read every function this calls ;; to see what stateful stuff is going on... (some-other-code request)) ``` So tests end up looking like the following, with pretty low confidence that everything has been stubbed out. ```clojure (with-redefs [sendgrid/send-email (constantly nil)] (t/testing ... ACTUAL TEST ...)) ``` ### Connection Pools Handlers also very often need to talk to a database. It is wasteful to make a new database connection on every request, so a really common technique is to keep a certain number of connections alive in a "pool" and re-use them over and over again. What is common, and saddening, to find is a connection pool stored in a top-level constant and referenced by a large part of the codebase. ```clojure (ns example.db (:import (com.zaxxer.hikari HikariConfig HikariDataSource))) (def pool (HikariDataSource. (doto (HikariConfig.) (.setJdbcUrl "...")))) ``` ```clojure (defn hello-handler [request] ;; Information like this can come from middleware. (let [user-id (:user-id request) user-name (jdbc/execute-one! db/pool ["SELECT name FROM user WHERE user.user_id = ?" user-id])] {:status 200 :body (str "Hello, " user-name)})) ``` Even assuming that, [like DHH](https://dhh.dk/2014/slow-database-test-fallacy.html), you are fine with your tests hitting a real database this still creates some practical problems. For one, if you edit the file where the connection is defined you might accidentally reload the constant and leak a bunch of connections. 
This isn't the most likely on a large project where you aren't touching this code that often, but over the course of a [long lived REPL session](https://www.youtube.com/watch?v=gIoadGfm5T8) it can be annoying.

But also it is annoying logistically that the connection pool is established immediately when the code is loaded. If you Ahead-of-Time compile your Clojure code then you will pretty immediately want that to not be the case.

You can sidestep that last issue by putting the connection pool behind a "[delay](https://clojuredocs.org/clojure.core/delay)", which lazily starts the connection pool when it is needed.

```clojure
(ns example.db)

(def pool
  (delay
    (HikariDataSource.
      (doto (HikariConfig.)
        (.setJdbcUrl "...")))))
```

But now this detail changes how users have to access the actual pool. Usage sites have to add an `@` to make sure the pool has been started and to retrieve it.

```clojure
(defn hello-handler
  [request]
  (let [user-id   (:user-id request)
        user-name (jdbc/execute-one!
                    @db/pool
                    ["SELECT name FROM user WHERE user.user_id = ?" user-id])]
    {:status 200
     :body   (str "Hello, " user-name)}))
```

Annoying, but that's not all. If you want to sub out the pool in a test fixture and maybe run tests in parallel then the whole pool needs to be dynamically re-bindable as well.

```clojure
(def ^:dynamic *pool*
  (delay
    (HikariDataSource.
      (doto (HikariConfig.)
        (.setJdbcUrl "...")))))
```

```clojure
(defn hello-handler
  [request]
  (let [user-id   (:user-id request)
        user-name (jdbc/execute-one!
                    @db/*pool*
                    ["SELECT name FROM user WHERE user.user_id = ?" user-id])]
    {:status 200
     :body   (str "Hello, " user-name)}))
```

```clojure
(binding [db/*pool* (delay (make-test-pool))]
  (insert-user 123 "bob")
  (let [response (hello-handler {:user-id 123})]
    (t/is (= (:body response) "Hello, bob"))))
```

All of that is workable - you can use macros and helper functions to alleviate the syntax ugliness and generally speaking your app _will_ just have one database.
But it also is not that uncommon for an app to have two databases. Usually one SQL and one Redis-like. And while it's not as hard as for arbitrary external services - you still don't really know from a call-site whether you need to establish a test database before calling it in a test. ## Inversion of Control The general shape of the solution to those problems is to not have "global" stateful resources. For external services, this means making an actual object to pass as the first argument to calls. If the service is like Sendgrid, this could be a convenient place to put information like your API key or make a [persistent http client](https://github.com/gnarroway/hato#building-a-client). ```clojure (defn make-sendgrid-client [api-key] {:api-key api-key :client (hato/build-http-client {:connect-timeout 10000 :redirect-policy :always})}) (defn send-email [sendgrid-client] (hato/post (:client sendgrid-client) "/send-email")) ``` But even if the service is "stupid" and requires no authentication or special treatment like Wikipedia, there is still value. ```clojure (defn make-wikipedia-client [] ;; Nothing really to put... {:name "Wikipedia Client"}) (defn get-marco-polo-info [wikipedia-client] (slurp "https://en.wikipedia.org/wiki/Marco_Polo")) ``` The value being in the fact that having _something_ as a first argument means that later on you have the ability to refactor calls to be behind some dispatch mechanism like a protocol. ```clojure (defprotocol WikipediaClient (get-marco-polo-info [_])) (defn make-wikipedia-client [] (reify WikipediaClient (get-marco-polo-info [_] (slurp "https://en.wikipedia.org/wiki/Marco_Polo")))) ``` Which in turn can enable creating "fake" implementations for testing. ```clojure (def fake-wikipedia (reify WikipediaClient (get-marco-polo-info [_] "was a dude, i guess?"))) ``` For connection pools, there is already an actual object to pass so that isn't an issue. 
The same "maybe make it a protocol later" strategy is applicable to that sort of resource as well. Then in all the code that wants these dependencies, just expect them to be given as arguments. ```clojure (defn marco-handler [wikipedia-client request] (if (= (:query-string request) "nopool") {:status 200 :body (wikipedia/get-marco-polo-info wikipedia-client)} {:status 200 :body "POLO!"})) ``` Which provides a clear path to sensible testing. ```clojure (ns example-test (:require [clojure.string :as string] [clojure.test :as t] [example])) (t/deftest marco-handler-test (let [mock-wikipedia (reify WikipediaClient (get-marco-polo-info [_] "INFO"))] (t/testing "Request to /marco gets POLO!" (let [response (example/marco-handler mock-wikipedia {:uri "/marco"})] (t/is (= (:status response) 200)) (t/is (= (:body response) "POLO!")))) (t/testing "Request to /marco with no pool gets info" (let [response (example/marco-handler mock-wikipedia {:uri "/marco" :query-string "nopool"})] (t/is (= (:status response) 200)) (t/is (= (:body response) "INFO"))))) ``` This technique - where we get dependencies as arguments instead of making them locally or getting them from some global place - is commonly called "[Inversion of Control](https://martinfowler.com/bliki/InversionOfControl.html)." ## Dependency Injection and "The System" While this is a concrete improvement - we can directly see what the dependencies of a process are in the argument list - there are still some unresolved issues. Let's say our `hello-handler` wants to use a `sendgrid-service` and the database `pool` and our `marco-handler` wants to use a `wikipedia-service` and the database `pool`. ```clojure (defn hello-handler [sendgrid-service pool request] ...) (defn marco-handler [wikipedia-service pool request] ...) ``` This implies that the root `handler` function will have access to all of these things and pass them down as needed. 
```clojure (defn handler [sendgrid-service wikipedia-service pool request] (cond (= (:uri request) "/hello") (hello-handler sendgrid-service pool request) (= (:uri request) "/marco") (marco-handler wikipedia-service pool request) :else {:status 404})) ``` With just three stateful components and two handlers this is manageable, but beyond three arguments using positional arguments is overly burdensome and error-prone. ```clojure (defn handler [sendgrid-service wikipedia-service pool some-service other-thing oh-no request] (cond (= (:uri request) "/hello") (hello-handler sendgrid-service pool request) (= (:uri request) "/marco") (marco-handler wikipedia-service pool request) (= (:uri request) "/thing") (some-handler some-service sendgrid-service request) (= (:uri request) "/thing2") (some-handler some-service oh-no sendgrid-service other-thing request) (= (:uri request) "/thing3") (some-handler some-service oh-no other-thing request) ;; ... * 100 :else {:status 404})) ``` The solution is to put all stateful components into a single map, popularly called the "system." ```clojure {:sendgrid-service sendgrid-service :wikipedia-service wikipedia-service :pool pool} ``` Then the handler just threads down this one map to all the entry-points ```clojure (defn handler [system request] (cond (= (:uri request) "/hello") (hello-handler system request) (= (:uri request) "/marco") (marco-handler system request) :else {:status 404})) ``` and individual handlers "declare" which of these components they are interested in by only pulling those keys out of the map. ```clojure (defn hello-handler [{:keys [sendgrid-service pool]} request] ...) (defn marco-handler [{:keys [wikipedia-service pool]} request] ...) ``` This way it is still declared up front what stateful components some bit of code needs to do its work, but the "wiring" code for each entry-point can stay uniform. 
This technique, where all a piece of code needs to do to get access to a resource is "declare" that they want it is usually called "[Dependency Injection](https://martinfowler.com/articles/injection.html)." Important to note also that after this "entry-point" code should generally pass down things explicitly. Passing the whole system is a hand-gun pointed at a foot-foot. ```clojure (defn marco-handler [{:keys [wikipedia-service pool] :as system} request] ... ;; Back to not knowing what this could be doing deep down... (some-code system) ...) ``` ## Starting and Stopping the System There needs to be some code that actually starts up all the components of the system. ```clojure (defn start-system [] (let [config (load-config) sendgrid-service (make-sendgrid-service config) wikipedia-service (make-wikipedia-service) pool (make-pool config)] {:config config :sendgrid-service sendgrid-service :wikipedia-service wikipedia-service :pool pool})) ``` Some stateful bits might depend on other stateful bits to get started. In the above example the hypothetical Sendgrid service and database connection pool depend on some config object which is loaded earlier. Clearest example of that is the server instance itself. If it is to be put into the system, then it will need all the things started before it. ```clojure (defn start-system [] (let [config (load-config) sendgrid-service (make-sendgrid-service config) wikipedia-service (make-wikipedia-service) pool (make-pool config) system-so-far {:config config :sendgrid-service sendgrid-service :wikipedia-service wikipedia-service :pool pool} server (start-server system-so-far)] (assoc system-so-far :server server))) ``` ```clojure (defn hello-handler [{:keys [sendgrid-service pool]} request] ...) (defn marco-handler [{:keys [wikipedia-service pool]} request] ...) 
(defn handler [system request] (cond (= (:uri request) "/hello") (hello-handler system request) (= (:uri request) "/marco") (marco-handler system request) :else {:status 404})) (defn start-server [system] (jetty/run-server (partial #'handler system) {:port 1234 :join? false})) ``` The reason you would want the server to be part of the system ties back to the REPL workflow. If you change or add some stateful component you might want to stop an old running system and start up a new one. The running http server is likely to be one of these things you would want to restart. To properly do this, every stateful resource which might have shutdown logic needs to provide a function which shuts it down. ```clojure (defn stop-server [server] (.stop server)) ``` And then some larger function needs to be able to stop each component of the system, doing so in the reverse order they were started ideally. ```clojure (defn stop-system [system] (stop-server (:server system)) (stop-connection-pool (:pool system)) ;; In this hypothetical the sendgrid service ;; has shutdown logic, but the wikipedia service does not. (stop-sendgrid-service (:sendgrid-service system))) ``` Then to facilitate working with the "current system" in the REPL it does need to be bound to some global value. ```clojure (ns example.repl (:require [example.system :as system])) (def system nil) (defn start-system! [] (alter-var-root #'system (constantly (system/start-system)))) (defn stop-system! [] (system/stop-system system) (alter-var-root #'system (constantly nil))) (comment (start-system!) (stop-system!)) ``` A developer can then reference `example.repl/system` in their REPL session to see the currently running system and pull out values to test calls to functions they are playing with. ```clojure (some-db-function (:pool example.repl/system) 123 "abc") ``` And while this does give birth to a global stateful thing, the problems of that are fairly mitigated. For one, it can reasonably exist only in development. 
In the code above there is a distinct namespace just for giving a `start-system!` and `stop-system!` to be used in development. On the tooling side you can even make sure this file isn't included in production builds with something like [deps.edn aliases.](https://clojure.org/reference/deps_and_cli) ```clojure ;; Assuming example/repl.clj is under dev-src {:paths ["src"] :aliases {:dev {:paths ["dev-src"]}}} ``` ## So what is integrant for? As I mentioned before, you need to start all of your stateful components in the right order and stop them all in the reverse of that order. ```clojure (defn start-system [] (let [config (load-config) sendgrid-service (make-sendgrid-service config) wikipedia-service (make-wikipedia-service) pool (make-pool config) system-so-far {:config config :sendgrid-service sendgrid-service :wikipedia-service wikipedia-service :pool pool} server (start-server system-so-far)] (assoc system-so-far :server server))) (defn stop-system [system] (stop-server (:server system)) (stop-connection-pool (:pool system)) (stop-sendgrid-service (:sendgrid-service system))) ``` A workable metaphor for this is that each component "depends on" the components that need to start before it and that these dependencies form a graph. Integrant, and libraries like it, provide ways to explicitly model that graph of dependencies. ```mermaid graph LR A[:config] --> B[:sendgrid-service] D[:wikipedia-service] --> C B --> C[:server] A --> E[:pool] E --> C A --> C ``` This reduces boilerplate and potential error-prone-ness with the `start-system` and `stop-system` functions that logically need to exist. 
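To make the graph metaphor a bit more concrete, here is a minimal sketch of deriving a start order from a plain dependency map. This is not how Integrant itself is implemented; the `dependencies` map and the `start-order` and `stop-order` helpers are hypothetical names made up for this illustration, and the map simply mirrors the arrows in the diagram above.

```clojure
;; Hypothetical dependency map: each key lists the components
;; that must already be started before it can start.
(def dependencies
  {:config            #{}
   :sendgrid-service  #{:config}
   :wikipedia-service #{}
   :pool              #{:config}
   :server            #{:config :sendgrid-service :wikipedia-service :pool}})

(defn start-order
  "Returns the components in an order where every component comes
  after all of its dependencies. Throws if there is a cycle."
  [deps]
  (loop [remaining deps
         started   []]
    (if (empty? remaining)
      started
      ;; A component is ready once everything it needs has been started.
      ;; Sorting the ready keys only makes the result deterministic.
      (let [ready (->> remaining
                       (filter (fn [[_ needs]] (every? (set started) needs)))
                       (map key)
                       sort)]
        (when (empty? ready)
          (throw (ex-info "Circular dependency"
                          {:remaining (keys remaining)})))
        (recur (apply dissoc remaining ready)
               (into started ready))))))

(defn stop-order
  "Stopping happens in the reverse of the start order."
  [deps]
  (reverse (start-order deps)))
```

With the map above, `(start-order dependencies)` puts `:config` before everything that needs it and `:server` last, and `stop-order` is just that sequence reversed. A hand-written `start-system` and `stop-system` pair encodes the same ordering implicitly, which is the part that gets error-prone as the number of components grows.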
In Integrant's case the dependency information is encoded into a map

```clojure
{:config            {}
 :sendgrid-service  {:config (ig/ref :config)}
 :wikipedia-service {}
 :pool              {:config (ig/ref :config)}
 :server            {:config            (ig/ref :config)
                     :sendgrid-service  (ig/ref :sendgrid-service)
                     :wikipedia-service (ig/ref :wikipedia-service)}}
```

and the information about how each thing is started and stopped is registered with the `ig/init-key` and `ig/halt-key!` multimethods.

```clojure
(defmethod ig/init-key :pool
  [_ {:keys [config]}]
  (HikariDataSource.
    (doto (HikariConfig.)
      (.setJdbcUrl (config/lookup config :JDBC_URL)))))

(defmethod ig/halt-key! :pool
  [_ pool]
  (.close pool))
```

Starting the system now means calling `ig/init-key` on everything in dependency order, and stopping it means calling `ig/halt-key!` in the reverse order. The pieces needed for a REPL workflow can then be brought in [via a library.](https://github.com/weavejester/integrant-repl)

Partially because multimethod registration is global - and partially because it's good practice regardless - the keys for different integrant components are generally made namespaced.

```clojure
(ns example.system
  (:require [integrant.core :as ig]))

(def system-map
  {::config            {}
   ::sendgrid-service  {::config (ig/ref ::config)}
   ::wikipedia-service {}
   ::pool              {::config (ig/ref ::config)}
   ::server            {::config            (ig/ref ::config)
                        ::sendgrid-service  (ig/ref ::sendgrid-service)
                        ::wikipedia-service (ig/ref ::wikipedia-service)}})
```

So in this context, the `::pool` syntax will expand to `:example.system/pool`. This helps avoid conflicts with multimethod registration, but also can be used in conjunction with features like [`as-alias`](https://clojure.atlassian.net/browse/CLJ-2123) to add some semantic and syntactic distinction to pulling components out of the system.

```clojure
(ns example.handlers
  ;; Without as-alias it would be really easy
  ;; to get circular dependencies doing this.
(:require [example.system :as-alias system])) (defn some-handler [{::system/keys [pool server]} request] ...) ``` Again, I find it important to note that integrant is just one of many libraries that do this "automatic wiring." Many have sprung up over the years, and it seems like there are more yet to come. There are tradeoffs and quirks to all of them. The important idea is just to pass things down as arguments and to start with the system maps at entry-points. ## Tying it all together To properly structure a Clojure App * **DO NOT** have stateful components be implicit ```clojure (defn do-thing [name] (slurp (str "https://website.com/get-info/" name))) ``` * **DO NOT** have stateful components be global constants ```clojure (def pool (make-db-pool)) (defn lookup-chair [chair-id] (jdbc/execute! pool ["SELECT * FROM chair WHERE chair.chair_id = ?"])) ``` * **DO** have code be safe to reload in the REPL and have changes be reflected immediately ```clojure (defn root-handler [request] ...) (defn start-server [] (jetty/run-server #'root-handler {:port 1234})) ``` * **DO** provide REPL workflow helpers in comments ```clojure (defn root-handler [request] ...) (defn start-server [] ...) (comment (start-server)) ``` * **DO** pass stateful components explicitly as arguments. ```clojure (defn lookup-chair [pool chair-id] (jdbc/execute! pool ["SELECT * FROM chair WHERE chair.chair_id = ?"])) ``` * **DO** have a "system map" which can be threaded to entry-points ```clojure (defn start-system [] (let [config (load-config) sendgrid-service (make-sendgrid-service config) wikipedia-service (make-wikipedia-service) pool (make-pool config) system-so-far {:config config :sendgrid-service sendgrid-service :wikipedia-service wikipedia-service :pool pool} server (start-server system-so-far)] (assoc system-so-far :server server))) ``` * **DO** have REPL helpers for working with the system. ```clojure (def system nil) (defn start-system! 
[] (alter-var-root #'system ...)) (defn stop-system! [] (alter-var-root #'system ...)) (comment (start-system!) (stop-system!)) ``` * **DO** pull out dependencies from the system using destructuring ```clojure (defn hello-handler [{:keys [pool]} request] ...) ``` * **DO** use namespaced keys in the "system map" ```clojure (ns example.system) (defn start-system [] (let [config (load-config) sendgrid-service (make-sendgrid-service config) wikipedia-service (make-wikipedia-service) pool (make-pool config) system-so-far {::config config ::sendgrid-service sendgrid-service ::wikipedia-service wikipedia-service ::pool pool} server (start-server system-so-far)] (assoc system-so-far ::server server))) ``` ```clojure (ns example.handlers (:require [example.system :as-alias system])) (defn hello-handler [{::system/keys [pool]} request] ...) ``` * **MAYBE** use a library like integrant to reduce boilerplate when starting and stopping the system. --- <details> <summary>Sidenotes</summary> ### Different Places of Injection A technique that was harder to show with how the code examples built up but is equally valid is attaching the system as "request context." By this I mean, have some middleware which takes the system and a handler and injects the system into the request under some key. ```clojure (defn wrap-system [system handler] (fn [request] (handler (assoc request :system system)))) ``` And then have entry-points pull what they need out from that nested key. ```clojure (defn some-handler [request] (let [{:keys [pool]} (:system request)] ...)) ``` Or even attach all the values into the request at the top level. ```clojure (defn wrap-system [system handler] (fn [request] (handler (merge request system)))) ``` ```clojure (defn some-handler [{:keys [pool] :as request}] ...) 
```

This all has the potential benefit of avoiding the need to wire the system explicitly to code that depends on it since it is now all contained in one object.

This sort of technique - by other names and syntaxes - is pretty popular in other worlds like JavaScript and Python.

### Is testing really important enough to do all this?

I will claim, without a convincing argument top of mind but with strong feelings in the core of my heart, that writing code like this makes it easier to reason about and refactor. The testing argument is just easier to make since I can more clearly show mechanical deficiencies in other approaches.

### What about mount?

There is a library called mount which uses the namespace loading graph as its mechanism for knowing what order to start and stop things.

```clojure
(mount/defstate other-thing
  :start (make-other-thing)
  :stop  (stop other-thing))

(mount/defstate thing
  :start (f other-thing)
  :stop  (stop thing))
```

This is better than using regular `def`s to store stateful components since you can start and stop everything from the REPL.

What this doesn't solve for is how call-sites get handles to the stateful components. Without discipline to not touch them directly, this will lead to an overall architecture indistinguishable from just using regular `def`s.

While I wouldn't recommend it for those reasons, if you already have an app structured "the wrong way" it can be an incremental step in the right direction to get things REPL-able.

### Contract Narrowing

In all examples I showed there are only like 5 actual stateful components. This is fine, but you might not feel the automatic wiring of integrant is "worth the cost" unless there is more than that.

One way that you can easily end up with more than a handful of components - even if you just have a single database - is if you practice "contract narrowing."
That is - instead of passing "`pool`" to consumers, which will let them do any arbitrary operations on the database, pass an object with a "narrower contract" like a "`user-service`."

```clojure
(defprotocol UserService
  (find-by-id [_ id]))

;; The record has to name the protocol it implements
;; before listing that protocol's methods.
(defrecord UserServiceImpl [pool]
  UserService
  (find-by-id [_ id]
    (jdbc/execute! pool ["SELECT * FROM user WHERE user.user_id = ?" id])))
```

```clojure
(defn start-system
  []
  (let [pool         (start-pool)
        user-service (->UserServiceImpl pool)]
    {::pool         pool
     ::user-service user-service}))
```

```clojure
(defn user-handler
  [{::system/keys [user-service]} request]
  ...
  (user-service/find-by-id user-service 123)
  ...)
```

With this sort of code it is a lot more reasonable that you would end up with enough components that manually wiring their dependencies would get troublesome.

</details>

Expand the section above for further elaboration. Brag about your holiday plans in the comments below.

Wed, 07 Dec 2022 05:00:00 +0000

A Practical Advent of Code

https://mccue.dev/pages/12-3-22-practical-advent

<meta property="og:image" content="https://mccue.dev/pages/12-3-22-santa.png" />
<meta name="twitter:image" content="https://mccue.dev/pages/12-3-22-santa.png">

I've never done more than a few days of [Advent of Code](https://adventofcode.com/). I'm sure it's fun if you're the kind of person who likes doing those sorts of puzzles, but that's not really what I'm into.

My jam is really finicky, relatively small problems. Problems that everyone can reasonably _do_ and could come up in real code, but where it's really hard to be happy with a solution.

So that's what this is. I'm starting three days in, and I'm nowhere close to prepared to give a challenge a day, but I want to share the sorts of problems that keep me up at night.

## The Challenge

The following three samples of JSON come from [the activity streams specification](https://www.w3.org/TR/activitystreams-core/). If you have misguided dreams of making the next Twitter, maybe you've looked at this too.
```json { "@context": "https://www.w3.org/ns/activitystreams", "summary": "A note", "type": "Note", "content": "My dog has fleas." } ``` ```json { "@context": { "@vocab": "https://www.w3.org/ns/activitystreams", "ext": "https://canine-extension.example/terms/", "@language": "en" }, "summary": "A note", "type": "Note", "content": "My dog has fleas.", "ext:nose": 0, "ext:smell": "terrible" } ``` ```json { "@context": [ "https://www.w3.org/ns/activitystreams", { "css": "http://www.w3.org/ns/oa#styledBy" } ], "summary": "A note", "type": "Note", "content": "My dog has fleas.", "css": "http://www.csszengarden.com/217/217.css?v=8may2013" } ``` In all of them, there is a piece of information we will call "the vocabulary". If the context is a string, the vocabulary is that string. If the context is an object, the vocabulary is under the `@vocab` key. If the context is an array, the vocabulary is a string in the first index of the array. So in all of these examples the vocabulary is `"https://www.w3.org/ns/activitystreams"` Assume one of these shapes of JSON is in a file called `activity.json`. Your job is to extract the vocabulary out of that file and print it. ## Restrictions * If you are employed, you need to use the JSON library you use for work. * You need to explain your code to St. Peter when you die. I am personally most interested in solutions on the "static" side of the world - Java, C#, Rust, etc. - because this is where the solutions really go from "obvious" to "cursed" and "magic". Leave solutions in the comments below. --- <details> <summary>My Solution</summary> I've been doodling on a JSON library for Java that I might find time to write about later, but in that my solution is the following. 
```java
import dev.mccue.json.Json;
import dev.mccue.json.decode.alpha.Decoder;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class Main {
    public static void main(String[] args) throws IOException {
        var json = Json.readString(
                Files.readString(Path.of("activity.json"))
        );

        var vocab = Decoder.field(
                json,
                "@context",
                context -> Decoder.oneOf(context,
                        Decoder::string,
                        Decoder.index(0, Decoder::string),
                        Decoder.field("@vocab", Decoder::string)
                )
        );

        System.out.println(vocab);
    }
}
```

Try it out for yourself if you have a mind to - I would appreciate feedback.

```xml
<dependencies>
    <dependency>
        <groupId>dev.mccue</groupId>
        <artifactId>json</artifactId>
        <version>0.0.9</version>
    </dependency>
    <dependency>
        <groupId>dev.mccue</groupId>
        <artifactId>json.decode.alpha</artifactId>
        <version>0.0.9</version>
    </dependency>
</dependencies>
```

</details>

Sat, 03 Dec 2022 05:00:00 +0000

Better Java logging, inspired by Clojure and Rust (II)

https://mccue.dev/pages/12-3-22-better-java-logging-2

Around three months ago [I wrote a pretty long rant about a Java logging library I doodled out](https://mccue.dev/pages/9-25-22-better-java-logging). My goal was to dream up something that would fill the same ecosystem role as SLF4J but with the primary goal of supporting the logging of structured data.

I got an absolute mountain of good feedback. After spending some time reading and internalizing the responses, I realized a few things.

1. I need to more effectively communicate my intent. Writing about the nitty-gritty details about choices in API design is my [marmalade sandwich](https://www.youtube.com/watch?v=7UfiCa244XE), but I need to do that *after* explaining what I am going to do and why.
2. Going forward, I need to properly contextualize how everything would interact with OpenTelemetry. Yes, OTel provides an API for propagating context, but OTel would not be suitable to include in a library.
There is still a role here for something which lets a library interact with that concept without including OTel outright.
3. All the opining about `ExtentLocal`s (now called `ScopedValue`s) was a bit tangential, but I do stand by the choice of making the API only propagate scope inside lambdas in preparation.
4. I need to be more mindful of allocations in general, but still think that having a reified but restricted set of log values is important and paying whatever cost there is to allocate those objects is worth it.
5. The API I was making was doing way too much. The ring buffer, the context stack, all the thread and timing info I was attaching to `Log` records - all of those are still valid directions to go with a logging _implementation_, but not with a logging _interface_.

Related to the last two points - I've written up a new draft of the API without any of the context implementation and minimizing allocations down to just the log entries. This has required a few changes - farewell my sweet prince `log`, the method which logs a `Log` - but so far seems to be okay. I'm not entirely sold on how I am handling context now though.

You can find all of that on my GitHub under [log.beta](https://github.com/bowbahdoe/log.beta).

I am writing this tiny update to give myself permission to put this on [the hammock](https://www.youtube.com/watch?v=f84n5oFoZBc) for a time and move on to some other fiddly, controversial, and fun topics. Stay tuned for that.

---

If you want to talk about this, feel free to reach out directly by any means necessary.

Sat, 03 Dec 2022 05:00:00 +0000

Better Java logging, inspired by Clojure and Rust

https://mccue.dev/pages/9-25-22-better-java-logging

<meta property="og:image" content="https://mccue.dev/pages/9-25-22-ill-steal-it.png" />
<meta name="twitter:image" content="https://mccue.dev/pages/9-25-22-ill-steal-it.png">

> Existing logging libraries are based on a design from the 80s and early 90s.
Most of the systems at the time where developed in standalone servers where logging messages to console or file was the predominant thing to do. Logging was mostly providing debugging information and system behavioural introspection. > > Most of modern systems are distributed in virtualized machines that live in the cloud. These machines could disappear any time. In this context logging on the local file system isn't useful as logs are easily lost if virtual machines are destroyed. Therefore it is common practice to use log collectors and centralized log processors. The ELK stack it has been predominant in this space for years, but there are a multitude of other commercial and open-source products. > > Most of these systems have to deal with non structured data represented as formatted strings in files. The process of extracting information out of these strings is very tedious, error prone, and definitely not fun. But the question is: why did we encode these as strings in the first place? This is just because existing log frameworks, which have been redesigned in various decades follow the same structure as when systems lived on the same single server for decades. > > I believe we need the break free of these anachronistic designs and use event loggers, not message loggers, which are designed for dynamic distributed systems living in cloud and using centralized log aggregators. > > \- Motivation for [`µ/log`](https://cljdoc.org/d/com.brunobonacci/mulog-slack/0.9.0/doc/readme#motivation). I think this description is essentially correct. Existing logging libraries, especially those in the Java ecosystem, are not ["fit predators"](https://youtu.be/A-mxj2vhVAA?t=543) for the modern world. I claim that we can do meaningfully better. Whether the design I come up with at the end of this is that better solution I do not know, but if it isn't I hope it spurs _someone_ to make whatever is. 
## Prior Work ### µ/log [`µ/log`](https://github.com/BrunoBonacci/mulog) is a Clojure library written by the author of the quote above. Clojure is a dynamically typed functional language. Not every design choice made by [`µ/log`](https://github.com/BrunoBonacci/mulog) will make sense for the statically typed Java, but there is bound to be a lot to pull. <figure> <a href="https://www.youtube.com/watch?v=WhwqBUCYQO0"> <img src="/pages/9-25-22-ill-steal-it.png" alt="I'll steal it. No one will ever know!"/> </a> <figcaption>I'll steal it. No one will ever know!</figcaption> </figure> A logging call in [`µ/log`](https://github.com/BrunoBonacci/mulog) looks something like this. ```clojure (μ/log :minesweeper.event/clicked-square :was-bomb true) ``` This first argument is the event type. It is stated best practice that the event type should have both a "namespace" and a "name". `minesweeper.event` and `clicked-square` respectively. The rest of the arguments are key-value pairs of arbitrary data. The convention in Clojure would be for that arbitrary data to be made of "data primitives" instead of "named aggregates". ```java // In Java, data is often tied to a "named aggregate" public final class Dog { private final String name; public Dog(String name) { this.name = name; } // And how data is accessed from that aggregate // is totally arbitrary, though conventions like // getX can allow for heuristics. public String retrieveName() { return this.name; } } ``` ```clojure ;; In clojure it would be uncommon to have a named "Dog" object (deftype Dog [name]) (μ/log :cool.dog/barked :loud true :dog (new Dog "jenny")) ;; Instead, you would have a "raw map" with data about said doggo. (μ/log :cool.dog/barked :loud true :dog {:name "rhodie"}) ``` So while [there are mechanisms to handle it](https://cljdoc.org/d/com.brunobonacci/mulog/0.9.0/doc/howtos/how-to-json-encode-custom-java-classes), serializing user defined classes is a relatively uncommon task. 
Downstream consumers will generally expect values to be these "base" values - lists, maps, strings, numbers - or directly decomposable into them. These data are then merged with a standard set of metadata about the log. * A timestamp saying when the "event" happened. * A UUID. The timestamp is pretty self-explanatory, but the UUID is a bit custom. It isn't a standard UUIDv4, but instead a construct called a "Flake" (as in snowflake, no two alike). A Flake has the following properties. * 192 bits. 128 random, 64 time based. * New ones compare greater than older ones. * Comparison order is maintained in String and byte form. * String representation is URL safe * Can be created in under 50 nanoseconds An example value looks like this: `4lIfs0B6IRjDMHo6g2Tbgrf4lzikQNXl`. I'll freely admit I don't understand the full scope of the problems this is intended to solve, but take it on faith that it's a good design and sensible alternative to `UUID.randomUUID()`. In addition to that standard metadata, any global or local (bound within a lexical scope) context is included. Global context is intended to be set at the start of the program with unchanging metadata. ```clojure (μ/set-global-context! {:app-name "cool-trail-cam"}) ``` Internally this information is stored in an [`atom`](https://clojuredocs.org/clojure.core/atom) - a thin wrapper over an [`AtomicReference`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/atomic/AtomicReference.html). ```clojure (defonce ^{:doc "The global logging context is used to add properties which are valid for all subsequent log events. This is typically set once at the beginning of the process with information like the app-name, version, environment, the pid and other similar info."} global-context (atom {})) ``` Local context is bound using the `μ/with-context` macro. 
```clojure (μ/with-context {:request-id "685a40fd-8740-4a2f-85ae-d6a4b2c02bb0"} (μ/log :cool.dog/barked :loud true :dog {:name "rhodie"})) ``` By virtue of being stored and retrieved from a [`ThreadLocal`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/lang/ThreadLocal.html), this context will be propagated even across function calls. ```clojure (def ^{:doc "The local context is local to the current thread, therefore all the subsequent call to log withing the given context will have the properties added as well. It is typically used to add information regarding the current processing in the current thread. For example who is the user issuing the request and so on."} local-context (ut/thread-local nil)) ``` All of this information merges together into a map that looks something like this. ```clojure {:mulog/event-name :cool.dog/barked, :mulog/timestamp 613022400000, :mulog/trace-id #mulog/flake "4lIfs0B6IRjDMHo6g2Tbgrf4lzikQNXl" :mulog/namespace "cool.dog" :app-name "cool-trail-cam" :request-id "685a40fd-8740-4a2f-85ae-d6a4b2c02bb0" :loud true :dog {:name "rhodie"}} ``` Then this map is "published". The internals of how this publishing works are [better documented elsewhere](https://cljdoc.org/d/com.brunobonacci/mulog/0.9.0/doc/%CE%BC-log-internals), but the gist of it is that the whole map is put into a ring buffer. An asynchronous process reads from that ring buffer and copies each entry into further ring buffers, one per publisher. Any number of asynchronous processes then periodically read from these buffers and do the work of printing to stdout, sending to CloudWatch, etc. `µ/log` calls each of these asynchronous processes which handle logs a "publisher". This can be a bit confusing in terminology, but your code "publishes" logs to a ring buffer, an unnamed process forwards these logs to more ring buffers, then "publishers" take those logs and do work with them.
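To make the buffering idea concrete, here is a minimal Java sketch of a bounded buffer that drops its oldest entry when it is full. This is not how `µ/log` implements its buffers; `LossyRingBuffer`, `publish`, and `drain` are names made up for illustration, and plain `synchronized` stands in for whatever more efficient structure a real implementation might use.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not µ/log's actual data structure.
final class LossyRingBuffer<T> {
    private final int capacity;
    private final ArrayDeque<T> items = new ArrayDeque<>();

    LossyRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds an item. When the buffer is full, the oldest item is
    // discarded so that writers never wait on slow readers.
    synchronized void publish(T item) {
        if (items.size() == capacity) {
            items.removeFirst();
        }
        items.addLast(item);
    }

    // Removes and returns everything currently buffered, oldest first.
    synchronized List<T> drain() {
        List<T> drained = new ArrayList<>(items);
        items.clear();
        return drained;
    }
}
```

A publisher in this picture would call `drain()` on its own buffer every so often. If it falls behind, the cost is lost older entries rather than an ever-growing backlog.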
<figure> <a href="https://cljdoc.org/d/com.brunobonacci/mulog/0.9.0/doc/%CE%BC-log-internals"> <img src="/pages/9-25-22-ring-buffer-diagram.png" alt="ring buffer publisher scheme diagram"/> </a> <figcaption>Diagram from `µ/log`'s docs.</figcaption> </figure> The goals of this scheme are to 1. Make it possible to "log with abandon." Publishing to a ring buffer should take only a few hundred nanoseconds. 2. Make it so that adding publishers doesn't impact overall performance. They all get their own buffers and work at their own pace. 3. Make observability resilient to failures. If publishers "get behind" for whatever reason - us-west-2 is down, [freak gasoline fight accident](https://www.youtube.com/watch?v=3Huc47Dqsg8) - a ring buffer might lose events but in exchange there won't be an infinite backlog to work through to send off more current data. ```clojure (μ/start-publisher! {:type :console}) (μ/start-publisher! {:type :simple-file :filename "/tmp/mulog/events.log"}) (μ/start-publisher! {:type :elasticsearch :url "http://localhost:9200/"}) ``` The final wrinkle is that in addition to logging "events" - a record of something that happened at a given point in time - `µ/log` lets you record "traces" - a record of a process that occurred over a span of time. ```clojure (μ/trace :cool-dog/chasing-squirrel [:got-away "hope so"] (chase-squirrel)) ``` Unlike a normal log this adds metadata signifying the duration of the process, whether that process terminated with an exception, and parent traces that the current trace is nested under. ```clojure {:mulog/event-name :cool-dog/chasing-squirrel, :mulog/timestamp 609652800000, :mulog/duration 777600000 :mulog/trace-id #mulog/flake "4lLBQ0kk-mmdCFrwI8Ravb6c8S3ddpq-" :mulog/root-trace #mulog/flake "4lLBQ0kbx1weOm_TT5Wynok6GLBllXQC" :mulog/outcome :ok :mulog/namespace "cool.dog" :app-name "cool-trail-cam" :got-away true} ;; If it fails, outcome will be :error and the exception included. 
{:mulog/event-name :cool-dog/chasing-squirrel, :mulog/timestamp 609652800000, :mulog/duration 777600000 :mulog/trace-id #mulog/flake "4lLBQ0kk-mmdCFrwI8Ravb6c8S3ddpq-" :mulog/root-trace #mulog/flake "4lLBQ0kbx1weOm_TT5Wynok6GLBllXQC" :mulog/outcome :error :mulog/exception ... :mulog/namespace "cool.dog" :app-name "cool-trail-cam" :got-away true} ``` ### tracing `tracing` is a Rust library with holistically similar goals to `µ/log`. There is a similar Rust library named `slog` which deserves mention. It has been around longer and has a stable API, but `tracing` has better support for interacting with Rust's async ecosystem and a larger active community. To quote [`slog`'s docs](https://github.com/slog-rs/slog#you-might-consider-using-tracing-instead). > Please check tracing and see if it is more suitable for your use-case. > It seems that it is already a go-to logging/tracing solution for Rust. My impression is that a lot of the broad strokes are the same, so I am going to focus mostly on `tracing`. `tracing` is built on top of [three core concepts](https://docs.rs/tracing/latest/tracing/index.html#core-concepts), spans, events, and subscribers. Spans are records of a "period of time" with a beginning and an end. This is a close parallel to what `µ/log` calls a "trace". ```rust let span = span!(Level::INFO, "toasting toast"); let _guard = span.enter(); // when _guard is dropped, the span is closed. ``` Semantically, this makes use of a property of Rust that doesn't exist in the JVM. When a struct in Rust is no longer in lexical scope, it is immediately "dropped." Dropping can implicitly run arbitrary code such as freeing memory, closing a socket, or - in this case - doing whatever work is needed to "close" a span. Spans also have a new kind of metadata attached to them in the "Level." The level serves two purposes 1. It hints to external systems how "serious" something is. 
If your service starts pumping out 10x as many `ERROR` spans or events, that is probably a sign something is afoot. 2. It lets internal systems make sensible decisions about "filtering". While running code locally `TRACE` spans and events might be relevant, but probably irrelevant in production. Events are records of something that happened at a single point in time. This is closest to what we would classically call a "log." ```rust event!(Level::INFO, "toast popped in"); ``` Events interact with spans in that any events emitted while a span is active will be "nested" under that span. ```rust let span = span!(Level::INFO, "toasting toast"); let _guard = span.enter(); event!(Level::INFO, "toast popped in"); // We know that the toast burned while toasting toast // but does a toaster toast toast or does toast toast toast? event!(Level::WARN, "toast burned"); // when _guard is dropped, the span is closed. ``` Spans and events are by default sent to the "global subscriber", which is meant to be set at the start of the program. ```rust let global_subscriber = ConsoleLoggingSubscriber::new(); tracing::subscriber::set_global_default(global_subscriber) .expect("Failed to set"); ``` This is stored inside a static mutable variable, which is prevented from being set twice or in a race by way of a global `AtomicUsize`. ```rust static GLOBAL_INIT: AtomicUsize = AtomicUsize::new(UNINITIALIZED); const UNINITIALIZED: usize = 0; const INITIALIZING: usize = 1; const INITIALIZED: usize = 2; static mut GLOBAL_DISPATCH: Option<Dispatch> = None; // ... 
pub fn set_global_default(dispatcher: Dispatch) -> Result<(), SetGlobalDefaultError> { if GLOBAL_INIT.compare_and_swap( UNINITIALIZED, INITIALIZING, Ordering::SeqCst) == UNINITIALIZED { unsafe { GLOBAL_DISPATCH = Some(dispatcher); } GLOBAL_INIT.store(INITIALIZED, Ordering::SeqCst); EXISTS.store(true, Ordering::Release); Ok(()) } else { Err(SetGlobalDefaultError { _no_construct: () }) } } ``` Subscribers are notified when an event happens, when a span is entered, when a span is exited, and a few other things. ```rust pub trait Subscriber: 'static { fn enabled(&self, metadata: &Metadata<'_>) -> bool; fn new_span(&self, span: &Attributes<'_>) -> Id; fn record(&self, span: &Id, values: &Record<'_>); fn record_follows_from(&self, span: &Id, follows: &Id); fn event(&self, event: &Event<'_>); fn enter(&self, span: &Id); fn exit(&self, span: &Id); fn register_callsite(&self, metadata: &'static Metadata<'static>) -> Interest { ... } fn max_level_hint(&self) -> Option<LevelFilter> { ... } fn event_enabled(&self, event: &Event<'_>) -> bool { ... } fn clone_span(&self, id: &Id) -> Id { ... } fn drop_span(&self, _id: Id) { ... } fn try_close(&self, id: Id) -> bool { ... } fn current_span(&self) -> Current { ... } unsafe fn downcast_raw(&self, id: TypeId) -> Option<*const ()> { ... } } ``` Okay, maybe more than a few other things. These methods are related to filtering events. Given some metadata, a subscriber can say whether they always want to be informed of an event, never want to be informed, or will need to do some runtime check to know if an event is relevant. Because of the magic of Rust, these checks can sometimes totally remove the runtime cost of recording irrelevant events. ```rust pub trait Subscriber: 'static { fn enabled(&self, metadata: &Metadata<'_>) -> bool; // ... fn register_callsite(&self, metadata: &'static Metadata<'static>) -> Interest { ... } fn max_level_hint(&self) -> Option<LevelFilter> { ... } fn event_enabled(&self, event: &Event<'_>) -> bool { ... 
} } ``` These methods allow for creating and manipulating spans. This is needed because `Subscriber`s are responsible for tracking information about spans like their start and end times. ```rust pub trait Subscriber: 'static { fn new_span(&self, span: &Attributes<'_>) -> Id; fn record(&self, span: &Id, values: &Record<'_>); fn record_follows_from(&self, span: &Id, follows: &Id); // ... fn clone_span(&self, id: &Id) -> Id { ... } fn drop_span(&self, _id: Id) { ... } fn try_close(&self, id: Id) -> bool { ... } fn current_span(&self) -> Current { ... } } ``` Here be dragons. ```rust pub trait Subscriber: 'static { // ... unsafe fn downcast_raw(&self, id: TypeId) -> Option<*const ()> { ... } } ``` And these are the more digestible ones. ```rust pub trait Subscriber: 'static { // ... fn event(&self, event: &Event<'_>); fn enter(&self, span: &Id); fn exit(&self, span: &Id); // ... } ``` The values supported by this system are defined implicitly by a "visitor" trait, where each method on the trait corresponds to a kind of data. ```rust pub trait Visit { fn record_debug(&mut self, field: &Field, value: &dyn Debug); fn record_value(&mut self, field: &Field, value: Value<'_>) { ... } fn record_f64(&mut self, field: &Field, value: f64) { ... } fn record_i64(&mut self, field: &Field, value: i64) { ... } fn record_u64(&mut self, field: &Field, value: u64) { ... } fn record_i128(&mut self, field: &Field, value: i128) { ... } fn record_u128(&mut self, field: &Field, value: u128) { ... } fn record_bool(&mut self, field: &Field, value: bool) { ... } fn record_str(&mut self, field: &Field, value: &str) { ... } fn record_error( &mut self, field: &Field, value: &(dyn Error + 'static) ) { ... } } ``` Values can therefore be a base numeric type (`f64`, `i64`, `u64`, `i128`, `u128`), a boolean, a string, a Rust Error, or anything that has a Debug representation. 
In addition, there is support for values defined via the [valuable crate](https://docs.rs/valuable/latest/valuable/), which allows for more arbitrary introspection. This can be seen as a more "typed" version of the system in `µ/log`. Events are still made up of "plain data" but the rolodex of what is allowed has a more explicit structure. Similar to the current state of things in Java, there is a large portion of the ecosystem which primarily logs text based messages, often through the [`log` crate](https://crates.io/crates/log). To deal with this, they provide a crate which massages the text based messages into tracing events called [`tracing_log`](https://docs.rs/tracing-log/latest/tracing_log/). For more comprehensive coverage, I recommend perusing [this RustConf talk](https://www.youtube.com/watch?v=engm2Wqfgjk) inclusive or [this blog post](https://tokio.rs/blog/2019-08-tracing), both by `tracing`'s primary maintainer. ### SLF4J The most popular logging facade for Java is [SLF4J](https://www.slf4j.org/news.html). Logging libraries like [Logback](https://logback.qos.ch/) and [Log4J](https://logging.apache.org/log4j/2.x/) provide a superset or a [super-dee-duperset](https://en.wikipedia.org/wiki/Log4Shell) of its functionality while acting as the implementation for any code that calls [SLF4J](https://www.slf4j.org/news.html). Nowadays, the mechanism for a library to "act as the implementation" is to provide an implementation of [org.slf4j.spi.SLF4JServiceProvider](https://www.slf4j.org/api/org/slf4j/spi/SLF4JServiceProvider.html) via the [service loader mechanism](https://mccue.dev/pages/1-20-22-service-provider-interface). Most Java developers only need to interact with this at the level of knowing that they need to add both the `slf4j-api` and `logback-classic` as dependencies to their project. I still think it is worth noting because this ability to pick a logging implementation at runtime has been crucial to SLF4J's success. 
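As a rough sketch of that mechanism, using only the JDK's `java.util.ServiceLoader`: the facade declares an interface, asks the service loader for an implementation, and falls back to a do-nothing one if none is on the classpath. `LoggingProvider`, `NoOpLoggingProvider`, and `LoggingProviders` are hypothetical stand-ins, not SLF4J's real types.

```java
import java.util.ServiceLoader;

// Hypothetical service interface, standing in for something
// like SLF4JServiceProvider.
interface LoggingProvider {
    String name();
}

// Fallback used when no implementation has been registered.
final class NoOpLoggingProvider implements LoggingProvider {
    @Override
    public String name() {
        return "no-op";
    }
}

final class LoggingProviders {
    private LoggingProviders() {}

    // Picks the first implementation the service loader can find,
    // or the no-op fallback if there is none.
    static LoggingProvider resolve() {
        return ServiceLoader.load(LoggingProvider.class)
                .findFirst()
                .orElseGet(NoOpLoggingProvider::new);
    }
}
```

An implementation library would register itself by shipping a `META-INF/services/` file named after the interface, or a `provides ... with ...` clause in its `module-info.java`. Adding or swapping that jar changes what `resolve()` returns without touching the calling code.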
In the most common usage pattern, a `private static final` field is set to the result of calling `LoggerFactory.getLogger(Class<?>)`. This gets a logger where logged messages know about the class they are logged from. ```java import org.slf4j.Logger; import org.slf4j.LoggerFactory; public final class Main { private static final Logger log = LoggerFactory.getLogger(Main.class); public static void main(String[] args) { log.info("Hello, world"); } } ``` Log messages can be plain English. ```java log.warn("Somebody stole my plain bagel!"); ``` Or they can contain placeholders for data to be formatted into. ```java log.info("I just reached level {} in Mouse Quest", 987413); ``` With special support for logging exceptions. ```java log.error("Hulu gives me 5 ads in a row and I pay for it!", e); ``` Every log is associated with a level just like in `tracing`. This functions as metadata as usual, but can also be used to avoid making potentially expensive log calls when the associated level is not enabled. ```java if (log.isDebugEnabled()) { log.debug("expensive data here {}", expensiveProcess()); } ``` Unlike `tracing`, data associated with the log is intended to be directly stuffed into the message. Despite the goal being ostensibly to produce English, pseudo-structured logs tend to be common. ```java log.info( "Starting background process: batchId={}, startTime={}", batchId, startTime ); ``` That being said, there *is* some support for attaching structured data via the "Mapped Diagnostic Context" (MDC) API. ```java try { MDC.put("request_id", request_id); log.info("This will have the request id available"); } finally { MDC.clear(); } ``` But that system is mechanically cumbersome to use and needs to be configured explicitly within the underlying logging implementation. [People don't use it that often](https://grep.app/search?q=MDC.put&case=true).
It also has no way to become compatible with the upcoming [`ExtentLocal`](https://openjdk.org/jeps/429) API, so implementations are forever locked to using [`ThreadLocal`](https://docs.oracle.com/en/java/javase/19/docs/api/java.base/java/lang/ThreadLocal.html)s. ```java // This try/finally structure cannot be adapted. try { MDC.put("something", "abc"); } finally { MDC.clear(); } // because ExtentLocals will only be able to be set in callbacks static final ExtentLocal<Context> CONTEXT = new ExtentLocal<>(); ExtentLocal.where(CONTEXT, ...) .run(() -> { /* code here */ }); ``` This is a problem, or at least potentially a problem, because [`ExtentLocal`](https://openjdk.org/jeps/429)s are designed to be considerably more efficient with a high number of threads. High numbers of threads will hopefully become the norm with Java 19+, so it's worth thinking about. And as of the [very latest version](https://www.slf4j.org/news.html) there is now also an API for doing structured logging directly with [`KeyValuePair`](https://www.slf4j.org/api/org/slf4j/event/KeyValuePair.html)s and the fluent logging API. ```java logger.atInfo() .addKeyValue("cat", "fred") .addKeyValue("snuggles", true) .log("I love this cat."); ``` I'd say [the jury is out](https://www.reddit.com/r/java/comments/cgswxz/comment/eukttut/?utm_source=share&utm_medium=web2x&context=3) on whether this actually will lead to many folks doing any degree of structured logging in practice, but it is certainly a welcome addition. ### System.Logger [Since Java 9](https://openjdk.org/jeps/264), there has actually been a logger bundled which functions somewhat similarly to SLF4J. The major points of distinction are that there are no explicit `info`, `warn`, `error`, etc. methods for logging at a particular level, there is direct support for localization via a [ResourceBundle](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/ResourceBundle.html), and there is no parallel to MDC.
```java public interface Logger { String getName(); boolean isLoggable(Level level); default void log(Level level, String msg) { // ... } default void log(Level level, Supplier<String> msgSupplier) { // ... } default void log(Level level, Object obj) { // ... } default void log(Level level, String msg, Throwable thrown) { // ... } default void log(Level level, Supplier<String> msgSupplier, Throwable thrown) { // ... } default void log(Level level, String format, Object... params) { // ... } void log(Level level, ResourceBundle bundle, String msg, Throwable thrown); void log(Level level, ResourceBundle bundle, String format, Object... params); } ``` ### OpenTelemetry [OpenTelemetry](https://opentelemetry.io/) is definitely worth a mention, if not solely for the voice in the back of my head telling me that I am just making that but poorly and less well-thought-out. OpenTelemetry defines [a set of language agnostic standards](https://github.com/open-telemetry/opentelemetry-specification) for how "observability data" should be recorded within an application and how it should be communicated and propagated to the rest of the system. I am going to make the [controversial, yet brave](https://www.youtube.com/watch?v=nuIw3m96jMQ) decision to not explain much of it here. [The documentation is far more thorough than I can be](https://opentelemetry.io/docs/concepts/) and enough of the concepts overlap that explaining the distinctions between what I'm writing and what the OpenTelemetry automatic and manual instrumentation libraries provide would double the size of this post. My hope is that what comes out of this whole thought experiment could potentially serve as a frontend to its manual instrumentation component, but I'm not going to make that a focus. It is overall pretty cool though. 
If I had to give anything concretely negative about it, it would be that my first experience using the automatic instrumentation library was it causing the startup time for a few services to exceed our health check grace period. That sucked. As of right now though, luckily for me, [OpenTelemetry doesn't provide a stable way to emit structured logs](https://opentelemetry.io/docs/reference/specification/logs/overview/#plain-text-formats). ## Design Going into this, I want something which <ol> <li id="goal-1"> Hits the same general notes as `µ/log` and `tracing`. </li> <li id="goal-2"> Is tolerable to existing Java programmers. </li> <li id="goal-3"> Is suitable for inclusion in libraries just like SLF4J. </li> </ol> ### Logger So the first task on the docket is to make an interface for our logger. ```java /** * A logger. */ public interface Logger { /** * Logs the log. * * @param log The log to log. */ void log(Log log); } ``` Nailed it. Now, there is actually some deeper reasoning here. In `µ/log`'s design all logs are juggled between ring buffers and passed between different consumers. Doing that is a lot easier when we make a `Log` a concrete thing that can be passed around. `µ/log` also gets away with no equivalent of `tracing`'s `span_enter` and `span_exit`. This leads me to believe (hand-wavingly) that if you aren't in the pursuit of true zero cost like Rust is, it is fine to only perform actions on `span_exit` and have the logic usually covered by `span_enter` (starting the timer, assigning an id, etc.) be handled in another way. So a single log method it is. Default methods can and will be added to make a more pleasant API, but that's the start. ### Log Now as to what constitutes a `Log`, at this stage I haven't laid out the full picture, so it's hard to talk about. I do think, however, that delineating `Event`s and `Span`s like `tracing` would be a good idea. `µ/log` doesn't need to care about this because it is Clojure.
In a dynamic language it is an expected sort of pattern to say "if this map has `:mulog/root-trace` and `:mulog/duration` it represents a span." You can see that exact example in its [zipkin publisher](https://github.com/BrunoBonacci/mulog/blob/master/mulog-zipkin/src/com/brunobonacci/mulog/publishers/zipkin.clj#L56). The best tool I know of for directly representing this is [sealed interfaces](https://openjdk.org/jeps/409). ```java public sealed interface Log { record Event() implements Log {} record Span() implements Log {} } ``` You can read this as "a `Log` is either an `Event` or a `Span`." Any information common to the two cases will be added to the interface as we move along. ### Level `tracing` has log levels and `µ/log` does not. The way I see it, there are two ways to think about levels. 1. Log levels aren't intrinsically special. If you want a standard way to indicate "severity" you can always agree on a standard for your codebase, but it's not something to force on the user. 2. Log levels are intrinsically special. It is a type of metadata you basically always want, and they act as a good first pass for filtering noisy events out of production. I bounced back and forth for a bit, but [Goal #2](#goal-2) tips the scales. Java programmers are used to levels; let them have levels. ```java enum Level { TRACE, DEBUG, INFO, WARN, ERROR } ``` If there are situations that arise in practice where no log level is appropriate, a dedicated "unspecified" level could make sense, but I'm putting that in my back pocket for now. ```java enum Level { UNSPECIFIED, TRACE, DEBUG, INFO, WARN, ERROR } ``` If not for [Goal #2](#goal-2), there would be no intrinsic reason these need to be the log levels either, but c'est la vie. ```java enum Level { DEVELOPMENT_ONLY, FIX_DURING_WORK_HOURS, WAKE_ME_UP_IN_THE_MIDDLE_OF_THE_NIGHT } ``` I am also choosing to nest this enum inside the `Log` interface.
```java public sealed interface Log { record Event() implements Log {} record Span() implements Log {} enum Level { TRACE, DEBUG, INFO, WARN, ERROR } } ``` This is a stylistic choice more than anything, but I like being able to refer to `Log.Level` as such. Not only does the pattern give one place for discovery - just type `Log.` in your IDE to see the full set of log related types - it can save me from having to add `Log` as a prefix to class names. Both `Event`s and `Span`s should have log levels, so that is represented like so. ```java public sealed interface Log { Level level(); record Event( @Override Level level ) implements Log {} record Span( @Override Level level ) implements Log {} enum Level { TRACE, DEBUG, INFO, WARN, ERROR } } ``` ### Categories I think `µ/log` had the right idea. Most logs should have both a "namespace" and a "name" component as their identifier. It is a pattern I think is pretty common in systems meant to take in structured data. One example: Cloudwatch wants both a ["source" and "detail-type"](https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatchEventsandEventPatterns.html) for their events. ```json { "version": "0", "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718", "detail-type": "EC2 Instance State-change Notification", "source": "aws.ec2", "account": "111122223333", "time": "2017-12-22T18:43:48Z", "region": "us-west-1", "resources": [ "arn:aws:ec2:us-west-1:123456789012:instance/ i-1234567890abcdef0" ], "detail": { "instance-id": " i-1234567890abcdef0", "state": "terminated" } } ``` It also gives a convenient analogue to the "class doing the logging" and "log message" that programmers are already used to. I'll call the pair of these two pieces the log's "category." ```java public sealed interface Log { // ... record Category(String namespace, String name) { } } ``` I can't imagine `Event`s having categories but `Span`s not having them so they both get them. 
```java public sealed interface Log { Level level(); Category category(); record Event( @Override Level level, @Override Category category ) implements Log {} record Span( @Override Level level, @Override Category category ) implements Log {} // ... record Category(String namespace, String name) { } } ``` ### Logger.Namespaced At this point we can start to sketch out what logging would look like. ```java Logger log = ...; log.log( new Log.Event( Log.Level.INFO, new Log.Category("some.Thing", "thing-happened") ) ); ``` Hideous. So clearly some helper methods are in order. To start, one helper method for logging an event. Spans can come later. ```java /** * A logger. */ public interface Logger { /** * Logs the log. * * @param log The log to log. */ void log(Log log); default void event( Log.Level level, Log.Category category ) { log(new Log.Event(level, category)); } } ``` So now the verbose `new Log.Event` call is not needed. ```java Logger log = ...; log.event( Log.Level.INFO, new Log.Category("some.Thing", "thing-happened") ); ``` Next, it is common to have a dedicated method for each log level. ```java /** * A logger. */ public interface Logger { /** * Logs the log. * * @param log The log to log. */ void log(Log log); default void event( Log.Level level, Log.Category category ) { log(new Log.Event(level, category)); } default void trace( Log.Category category ) { event(Log.Level.TRACE, category); } default void debug( Log.Category category ) { event(Log.Level.DEBUG, category); } default void info( Log.Category category ) { event(Log.Level.INFO, category); } default void warn( Log.Category category ) { event(Log.Level.WARN, category); } default void error( Log.Category category ) { event(Log.Level.ERROR, category); } } ``` ```java Logger log = ...; log.info(new Log.Category("some.Thing", "thing-happened")); ``` This is better still, but the `new Log.Category` still feels like a lot to ask of folks on every single call. 
For most applications, the namespace component of logs emitted from any particular class is likely to stay constant. It is probably going to be the class name. As such, I think that there should be two types. `Logger` and `Logger.Namespaced`. `Logger` will let folks log while specifying the full `Log.Category` and `Logger.Namespaced` will have the namespace part of the category pre-filled. ```java /** * A logger. */ public interface Logger { /** * Logs the log. * * @param log The log to log. */ void log(Log log); default void event( Log.Level level, Log.Category category ) { log(new Log.Event(level, category)); } // ... default Namespaced namespaced(String namespace) { return new NamespacedLogger(namespace, this); } interface Namespaced { void event(Log.Level level, String name); default void trace( String name ) { event(Log.Level.TRACE, name); } default void debug( String name ) { event(Log.Level.DEBUG, name); } default void info( String name ) { event(Log.Level.INFO, name); } default void warn( String name ) { event(Log.Level.WARN, name); } default void error( String name ) { event(Log.Level.ERROR, name); } } } ``` ```java // The implementation is trivial record NamespacedLogger(String namespace, Logger logger) implements Logger.Namespaced { public void event(Log.Level level, String name) { logger.event( level, new Log.Category(this.namespace, name) ); } } ``` ### Occurrence Before adding support for spans to the logger, we need to take a minor detour. The key difference between an `Event` and a `Span` is that an `Event` happens at a singular point in time while a `Span` takes place over a span of time. This could be represented directly, by having `Event`s track when they happened and having `Span`s track their start time and how long they lasted. 
```java public sealed interface Log { Level level(); Category category(); record Event( @Override Level level, @Override Category category, Instant happenedAt ) implements Log {} record Span( @Override Level level, @Override Category category, Instant startedAt, Duration lasted ) implements Log {} // ... } ``` Personally though, I find it more interesting to unify the concepts a little. Every log occurs at some time that is either a singular point of time or a span of time. ```java public sealed interface Log { // ... sealed interface Occurrence { record PointInTime( java.time.Instant happenedAt ) implements Occurrence { } record SpanOfTime( java.time.Instant startedAt, java.time.Duration lasted ) implements Occurrence { } } } ``` This lets us put `Occurrence` on the `Log` interface directly, while still having `Event` and `Span` only have the data they want. ```java public sealed interface Log { Occurrence occurrence(); Level level(); Category category(); record Event( @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category ) implements Log {} record Span( @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category ) implements Log {} // ... sealed interface Occurrence { record PointInTime( java.time.Instant happenedAt ) implements Occurrence { } record SpanOfTime( java.time.Instant startedAt, java.time.Duration lasted ) implements Occurrence { } } } ``` For `Event`s, we can make it a bit easier by adding a constructor which sets the occurrence to the current point in time. 
```java public sealed interface Log { Occurrence occurrence(); Level level(); Category category(); record Event( @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category ) implements Log { public Event( Level level, Category category ) { this(new Occurrence.PointInTime(Instant.now()), level, category); } } record Span( @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category ) implements Log {} // ... sealed interface Occurrence { record PointInTime( java.time.Instant happenedAt ) implements Occurrence { } record SpanOfTime( java.time.Instant startedAt, java.time.Duration lasted ) implements Occurrence { } } } ``` For `Span`s that doesn't make quite as much sense, so auto-timing logic will probably end up being in the logger. There are two general strategies that would work for adding `Span`s to the `Logger` and `Logger.Namespaced` interfaces. The first is for a method to return something closeable. When that something is closed, the span is finished. ```java public interface Logger { void log(Log log); // ... interface SpanHandle extends AutoCloseable { void close(); } default SpanHandle span( Log.Level level, Log.Category category ) { var start = Instant.now(); return () -> { var end = Instant.now(); log( new Log.Span( new Log.Occurrence.SpanOfTime( start, Duration.between(start, end) ), level, category ) ); }; } // ... } ``` ```java try (var handle = log.span( Log.Level.INFO, new Log.Category("something", "happened"))) { // Span open here System.out.println("I am in the span!"); } // Closed when you exit ``` This is convenient to use with the try-with-resources syntax and has the nice benefit of playing well with checked exceptions. By that I mean, if the code within the span wants to throw a checked exception up a level, or have it handled in a certain way, it is trivial to propagate that exception.
```java void func() throws IOException, SQLException { try (var handle = log.span( Log.Level.INFO, new Log.Category("something", "happened"))) { codeThatThrowsIOExceptionOrSqlException(); } // Closed when you exit } ``` The other way is to directly take some code to run as a callback. ```java public interface Logger { void log(Log log); // ... default <T> T span( Log.Level level, Log.Category category, Supplier<T> code ) { var start = Instant.now(); try { return code.get(); } finally { var end = Instant.now(); log( new Log.Span( new Log.Occurrence.SpanOfTime( start, Duration.between(start, end) ), level, category ) ); } } default void span( Log.Level level, Log.Category category, Runnable code ) { span(level, category, () -> { code.run(); return null; }); } // ... } ``` This has some major downsides - for one, we now can't do things like return early from a function within a span. ```java void func() { int x = 8; log.span( Log.Level.INFO, new Log.Category("space", "hit-debris"), () -> { if (x == 8) { // Can only return directly from this lambda, // not the whole method. return; } System.out.println("Inside the span!"); } ); System.out.println("Will get here"); } ``` And we also cannot automatically handle checked exceptions in the general case. Like, hypothetically the callback interface could look like this. ```java interface SpanCallback<E extends Throwable> { void run() throws E; } ``` Which would allow us to write code which throws one kind of checked exception. ```java void func() throws IOException { // Correctly throws IOException log.span( Log.Level.INFO, new Log.Category("space", "hit-debris"), () -> { throw new IOException(); } ); } ``` But if code needed to throw multiple disjoint kinds of exceptions, that would be a problem since the `throws E` needs to resolve to a single type. So if the code wrapped in the span throws both `IOException` and `SQLException`, the only common type would be `Exception`.
```java // Would want this to be throws IOException, SQLException... void func() throws Exception { log.span( Log.Level.INFO, new Log.Category("space", "hit-debris"), () -> { if (Math.random() > 0.5) { throw new IOException(); } else { throw new SQLException(); } } ); } ``` There are some solutions depending on sensibilities, but none are perfect. * You can wrap in a `RuntimeException` subclass * [Sneaky throw](https://projectlombok.org/features/SneakyThrows) * [Smuggle the exception](https://mccue.dev/pages/11-1-21-smuggling-checked-exceptions) Even given these issues, I think the callback system is the better of the two. [`ExtentLocal`](https://openjdk.org/jeps/429)s basically mandate it and, as a tiny spoiler, `Span`s are going to need to propagate context to nested logs. It also isn't possible to mess up. try-with-resources is good and IDEs will warn if you don't close a returned auto-closeable, but there are still some cursed situations that can arise if you do not. So with that settled, both `Logger` and `Logger.Namespaced` are due their full complement of `span` methods for each log level. ```java public interface Logger { void log(Log log); // ... default <T> T span( Log.Level level, Log.Category category, Supplier<T> code ) { // ... } default void span( Log.Level level, Log.Category category, Runnable code ) { // ... } // ... default <T> T infoSpan( Log.Category category, Supplier<T> code ) { return span(Log.Level.INFO, category, code); } default void infoSpan( Log.Category category, Runnable code ) { span(Log.Level.INFO, category, code); } default <T> T warnSpan( Log.Category category, Supplier<T> code ) { return span(Log.Level.WARN, category, code); } default void warnSpan( Log.Category category, Runnable code ) { span(Log.Level.WARN, category, code); } // ... interface Namespaced { // ...
<T> T span( Log.Level level, String name, Supplier<T> code ); default void span( Log.Level level, String name, Runnable code ) { span(level, name, () -> { code.run(); return null; }); } // ... default <T> T infoSpan( String name, Supplier<T> code ) { return span(Log.Level.INFO, name, code); } default void infoSpan( String name, Runnable code ) { span(Log.Level.INFO, name, code); } default <T> T warnSpan( String name, Supplier<T> code ) { return span(Log.Level.WARN, name, code); } default void warnSpan( String name, Runnable code ) { span(Log.Level.WARN, name, code); } // ... } } ``` ### Entry To be useful, logs often need to carry some dynamic information. What user is making the request? What is the database record that is about to be altered? These are the "log entries." The representation for this seems to be pretty universally key-value pairs, so that is what I am going with. ```java sealed interface Log { // ... record Entry(String key, Value value) {} } ``` It follows that logs will get some list of log entries. ```java public sealed interface Log { Occurrence occurrence(); Level level(); Category category(); List<Entry> entries(); record Event( @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log { public Event( Level level, Category category ) { this( new Occurrence.PointInTime(Instant.now()), level, category, List.of() ); } } record Span( @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log {} // ... record Entry(String key, Value value) {} } ``` `µ/log` merges its log entries immediately into a map, but I don't think this is necessary. For one, we are going to be adding semantics to `Log.Entry` that don't exist in `Map.Entry`, like requiring that the key and value be non-null. ```java public sealed interface Log { // ...
record Entry(String key, Value value) { public Entry { Objects.requireNonNull( key, "key must not be null" ); Objects.requireNonNull( value, "value must not be null" ); } } } ``` Also, the most common operation that is going to be performed is iterating over the full list of entries to build some JSON or similar. All the pieces of the logs that have semantic significance - like the level or when they occurred - have dedicated methods and places in the objects for them. `µ/log` is also slightly special in that the keys in its maps generally wouldn't be `String`s. It is idiomatic in Clojure to use [`keywords`](https://clojuredocs.org/clojure.core/keyword), which have the distinction of having a precomputed hash code. If the internal representation were a `Map<String, Value>` then there would be a constant and probably pointless cost for doing that hashing. Both `Logger` and `Logger.Namespaced` need to be updated as well. ```java interface Logger { // ... default void info( Log.Category category, List<Log.Entry> entries ) { // ... } default <T> T infoSpan( Log.Category category, List<Log.Entry> entries, Supplier<T> code ) { // ... } // ... interface Namespaced { // ... default void info( String name, List<Log.Entry> entries ) { // ... } default <T> T infoSpan( String name, List<Log.Entry> entries, Supplier<T> code ) { // ... } // ... } } ``` Using `List.of()` to manually make lists is a bit tedious and this parameter does often come at the end of the arguments list, so a varargs overload feels like a good idea. ```java interface Logger { // ... interface Namespaced { // ... default void info( String name, List<Log.Entry> entries ) { // ... } default void info( String name, Log.Entry... entries ) { info(name, List.of(entries)); } // ...
} } ``` Syntactically we can stop there, but if you take a gander at [the docs for `List.of()`](https://docs.oracle.com/en/java/javase/19/docs/api/java.base/java/util/List.html#of()) you will see everything up to a ten argument overload. This is pretty plainly to avoid allocating the array for varargs, so it might make sense to do that for this API too. Five log levels plus one base `event` method in `Logger` and also in `Logger.Namespaced`. If I gave each all ten argument overloads to match the overloads of `List.of()` that would be extra 120 methods plus carpal tunnel. For the hypothetical performance of something no one is using yet. I'm not going to do that. Which just leaves open the question, what exactly is a `Value`? ### Value Taking a page from `tracing`, I am going to restrict what is allowed as a `Value`. So long as strings, numbers, booleans, lists, and maps are represented anything is probably acceptable, but I am going to be a bit liberal with what is allowed. A good place to start might be to put my foot down and say that booleans are allowed. ```java sealed interface Value { // ... record Boolean(boolean value) implements Value { } // ... } ``` Any of the primitive numeric types are okay. ```java sealed interface Value { // ... record Byte(byte value) implements Value { } record Character(char value) implements Value { } record Short(short value) implements Value { } record Integer(int value) implements Value { } record Long(long value) implements Value { } record Float(float value) implements Value { } record Double(double value) implements Value { } // ... } ``` And `String`s seem pretty cool too. ```java sealed interface Value { // ... record String(java.lang.String value) implements Value { // ... public String { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` UUIDs are relatively common to come across and are immutable and pretty trivially translatable to a string. ```java sealed interface Value { // ... 
record UUID(java.util.UUID value) implements Value { public UUID { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` URIs [(not URLs!)](https://brian.pontarelli.com/2006/12/05/mr-gosling-why-did-you-make-url-equals-suck/) are equally simple and come up quite a bit when making web services. ```java sealed interface Value { // ... record URI(java.net.URI value) implements Value { public URI { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` The types in `java.time` are pretty crucial. ```java sealed interface Value { // ... record Instant(java.time.Instant value) implements Value { public Instant { Objects.requireNonNull(value, "value must not be null"); } } record LocalDateTime(java.time.LocalDateTime value) implements Value { public LocalDateTime { Objects.requireNonNull(value, "value must not be null"); } } record LocalDate(java.time.LocalDate value) implements Value { public LocalDate { Objects.requireNonNull(value, "value must not be null"); } } record LocalTime(java.time.LocalTime value) implements Value { public LocalTime { Objects.requireNonNull(value, "value must not be null"); } } record Duration(java.time.Duration value) implements Value { public Duration { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` And `Exception`s must be in the top 10 things to want to log. ```java sealed interface Value { // ... record Throwable(java.lang.Throwable value) implements Value { public Throwable { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` Logging Lists is one of the things that we need to support to keep parity with `µ/log` and `tracing` ```java sealed interface Value { // ... record List(java.util.List<Value> value) implements Value { } // ... } ``` At which point, we need to talk about null. In all the cases so far, I've added an `Objects.requireNonNull` to the canonical constructor for each of these cases. 
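To make the effect of those compact constructors concrete, here is a small sketch of what happens when a null slips in. It uses a trimmed-down copy of `Value` with only two cases; `NullCheckSketch` and `rejects` are hypothetical names used purely for illustration.

```java
import java.util.Objects;

// Sketch only: a trimmed-down Value with just two of the cases above.
final class NullCheckSketch {
    sealed interface Value {
        record String(java.lang.String value) implements Value {
            public String {
                // Same check as in the article's records.
                Objects.requireNonNull(value, "value must not be null");
            }
        }

        // Primitives cannot be null, so no check is needed here.
        record Boolean(boolean value) implements Value {}
    }

    // Hypothetical helper: reports whether wrapping `s` is rejected
    // by the compact constructor.
    static boolean rejects(java.lang.String s) {
        try {
            new Value.String(s);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

With this in place `NullCheckSketch.rejects(null)` is true and `NullCheckSketch.rejects("abc")` is false: the record can never hold a null, at the cost of throwing at construction time.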
To me this tracks because it wouldn't make much sense to have both a "String" and an "Instant" with null values allowed. ```java // Should this be true or false? // If it should be true that will require some wacky equals method. Objects.equals( new Log.Entry.Value.String(null), new Log.Entry.Value.Instant(null), ) ``` The problem is, people end up with null values pretty much constantly. Logging that you are about to perform some operation on an entity and that entity is unexpectedly null is unbelievably common. ```java // Sorry, we crashed because of a log! new Log.Entry.Value.String(s); ``` To remedy this, I could add constructor functions to `Log.Entry.Value` which automatically handle null values. ```java sealed interface Value { // ... static Value.String of(java.lang.String value) { if (value == null) { return null; } else { return new String(value); } } record String(java.lang.String value) implements Value { // ... public String { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` For the primitive types it will be a little pointless, but it can help avoid errors with their wrappers. ```java sealed interface Value { // ... static Value.Boolean of(boolean value) { return new Boolean(value); } static Value.Boolean of(java.lang.Boolean value) { if (value == null) { return null; } else { return new Boolean(value); } } record Boolean(boolean value) implements Value { } // ... } ``` But with lists, there is another foot gun we have just loaded. Lists made with `List.of` do not support `null` elements, so it is more than likely folks will end up with seemingly okay code and an inexplicable crash. ```java // If null, will still crash! Log.Entry.Value.of(List.of( Log.Entry.Value.of(someString) )) ``` So to sidestep this, we need to make our own null. It feels contrived, I know. 
```java sealed interface Value { enum Null implements Value { INSTANCE; @Override public java.lang.String toString() { return "Null"; } } } ``` Then for all the constructor functions, this becomes the fallback. ```java sealed interface Value { // ... static Value of(java.lang.String value) { if (value == null) { return Null.INSTANCE; } else { return new String(value); } } enum Null implements Value { INSTANCE; @Override public java.lang.String toString() { return "Null"; } } record String(java.lang.String value) implements Value { // ... public String { Objects.requireNonNull(value, "value must not be null"); } } // ... } ``` Which solves our current issue fairly neatly. ```java String s1 = "abc"; String s2 = null; // Our "Null" isn't "null", so all is well. Log.Entry.Value.of(List.of( Log.Entry.Value.of(s1), Log.Entry.Value.of(s2) )) ``` The remaining kinds of data to consider are maps and sets. Both would have had the same issues with null and their convenient constructor functions in `Map.of()` and `Set.of()` so it is good to have that resolved. ```java sealed interface Value { // ... record Map(java.util.Map<java.lang.String, Value> value) implements Value { public Map(java.util.Map<java.lang.String, Value> value) { Objects.requireNonNull(value, "value must not be null"); this.value = value.entrySet() .stream() .collect(Collectors.toUnmodifiableMap( java.util.Map.Entry::getKey, entry -> entry.getValue() == null ? Null.INSTANCE : entry.getValue() )); } } record Set(java.util.Set<Value> value) implements Value { public Set(java.util.Set<Value> value) { Objects.requireNonNull(value, "value must not be null"); this.value = value.stream() .map(v -> v == null ? Null.INSTANCE : v) .collect(Collectors.toUnmodifiableSet()); } } } ``` For `Map`s there is no intrinsic reason it has to be this way, but I chose to restrict the keys to be `String`s. 
This is both more convenient for eventual serialization to JSON and avoids issues like having two keys which would serialize to the same form if string-ified. ```java // This would be annoying to handle in JSON Log.Entry.Value.of(Map.of( Log.Entry.Value.of(123), Log.Entry.Value.of("abc"), Log.Entry.Value.of("123"), Log.Entry.Value.of("def"), )) ``` Now for the last value kind, I promise. Occasionally producing the value for a log might be either expensive, intrinsically fallible, or both - like fetching a value from a remote server. For this, we want to provide a way to lazily compute a value. I took the implementation of this from a combo of [vavr's Lazy](https://javadoc.io/doc/io.vavr/vavr/0.9.2/io/vavr/Lazy.html) and [Clojure's delay](https://clojuredocs.org/clojure.core/delay). ```java sealed interface Value { // ... final class Lazy implements Value { // Implementation based off of clojure's Delay + vavr's Lazy private volatile Supplier<? extends Value> supplier; private Value value; public Lazy(Supplier<? extends Value> supplier) { Objects.requireNonNull( supplier, "supplier must not be null" ); this.supplier = supplier; this.value = null; } public Value value() { if (supplier != null) { synchronized (this) { final var s = supplier; if (s != null) { try { this.value = Objects.requireNonNullElse( s.get(), Null.INSTANCE ); } catch (java.lang.Throwable throwable) { this.value = new Throwable(throwable); } this.supplier = null; } } } return this.value; } @Override public java.lang.String toString() { if (supplier != null) { return "Lazy[pending]"; } else { return "Lazy[realized: value=" + value() + "]"; } } } } ``` And this gets its own `ofLazy` factory functions of course. ```java sealed interface Value { // ... static Value ofLazy(Supplier<Value> valueSupplier) { return new Value.Lazy(valueSupplier); } static <T> Value ofLazy(T value, Function<T, Value> toValue) { return new Value.Lazy(() -> { var v = toValue.apply(value); return v == null ? 
Value.Null.INSTANCE : v; }); } // ... } ``` Now making a log will look like this. ```java log.info("dog-barked", new Log.Entry( "name", Log.Entry.Value.of("gunny") )); ``` Which is still a bit too verbose, so for all of those Value constructor functions I will add a matching one in `Log.Entry`. ```java log.info("dog-barked", Log.Entry.of("name", "gunny")); ``` Which is finally terse enough that I could believe your average Jane or Joe writing it. ### Flake Now a bit of a roundup, I am giving logs the `Flake` from `µ/log`. I actually copied the class exactly as written. ```java public sealed interface Log { Flake flake(); Occurrence occurrence(); Level level(); Category category(); List<Entry> entries(); record Event( @Override Flake flake, @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log { public Event( Level level, Category category ) { this( Flake.create(), new Occurrence.PointInTime(Instant.now()), level, category, List.of() ); } } record Span( @Override Flake flake, @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log {} // ... } ``` ### Thread Apparently, getting the current thread is [basically free](https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/5069), and since the ultimate goal is to allow sending logs around to threads other than the one they originated on, gathering that as metadata feels appropriate.
```java public sealed interface Log { Thread thread(); Flake flake(); Occurrence occurrence(); Level level(); Category category(); List<Entry> entries(); record Event( @Override Thread thread, @Override Flake flake, @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log { public Event( Level level, Category category ) { this( Thread.currentThread(), Flake.create(), new Occurrence.PointInTime(Instant.now()), level, category, List.of() ); } } record Span( @Override Thread thread, @Override Flake flake, @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log {} // ... } ``` ### Outcome When a span finishes, `µ/log` records whether that span threw an exception and if so what the exception was. This is very doable; we just need to make sure to catch and re-throw anything thrown while performing work in a span. ```java sealed interface Log { record Span( @Override Thread thread, @Override Flake flake, Outcome outcome, @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries) implements Log { public sealed interface Outcome { enum Ok implements Outcome { INSTANCE; @Override public String toString() { return "Ok"; } } record Error(Throwable throwable) implements Outcome { } } } } ``` If it weren't for this, `Span`s and `Event`s could probably be a single class. Unfortunately there isn't a neat value to put for an `Outcome` in an `Event`. ### Context The last thing to worry about is context. We need some mechanism for context to propagate from span to span and across method call boundaries, and we need to define what exactly is allowed to be present in context. The easiest to consider is global context. All we need is some `Log.Entry`s which will be included in every log.
```java sealed interface Log { sealed interface Context { record Global(List<Entry> entries) implements Context { } } } ``` I don't see any reason to be as strict as `tracing` is when it comes to setting global context more than once, so using `µ/log`'s storage strategy of an `AtomicReference` is likely good enough. ```java // Separate, internal, class to avoid exposing class Globals { static final AtomicReference<Log.Context.Global> GLOBAL_CONTEXT = new AtomicReference<>(new Log.Context.Global(List.of())); } ``` And at any point it will be "safe" - if maybe leading to strange semantics - to set this context. ```java sealed interface Log { // ... static void setGlobalContext(List<Entry> entries) { GLOBAL_CONTEXT.set(new Context.Global(entries)); } // ... } ``` There are then two other kinds of "child" context. The first is "plain" context which is intended to just carry log entries down. The second is "span" context which doesn't carry entries but instead metadata about the current span. ```java sealed interface Log { sealed interface Context { record Global(List<Entry> entries) implements Context { } sealed interface Child extends Context { Context parent(); record Plain( List<Entry> entries, @Override Context parent ) implements Child { } record Span( Thread thread, Instant startedAt, Flake spanId, @Override Context parent ) implements Child { } } } } ``` Both `Plain` and `Span` child contexts have a field linking to their parent context. This makes this a strange and wonderful kind of linked list. To access the immediate parent or the root span, you can just crawl the linked list. This means that unlike `µ/log` we don't need to explicitly pass down a `:mulog/root-trace` or `:mulog/parent-trace`. We just have to [accept linked lists with all their flaws](https://rust-unofficial.github.io/too-many-lists/). 
```java sealed interface Log { sealed interface Context { Optional<Child.Span> parentSpan(); default Optional<Child.Span> rootSpan() { return this.parentSpan() .map(parent -> parent.rootSpan().orElse(parent)); } record Global(List<Entry> entries) implements Context { @Override public Optional<Child.Span> parentSpan() { return Optional.empty(); } } sealed interface Child extends Context { Context parent(); @Override default Optional<Span> parentSpan() { var parent = parent(); if (parent instanceof Span parentSpan) { return Optional.of(parentSpan); } else { return parent.parentSpan(); } } record Plain( List<Entry> entries, @Override Context parent ) implements Child { } record Span( Thread thread, Instant startedAt, Flake spanId, @Override Context parent ) implements Child { } } } } ``` Now to propagate this non-global context a `ThreadLocal` will be used. Yes, I've mentioned many times how I want this design to be forward compatible with `ExtentLocal`s, but getting those early access builds is a bit hard, and I don't want to lock people who want to toy with this API today to have to figure that out. ```java class Globals { static final AtomicReference<Log.Context.Global> GLOBAL_CONTEXT = new AtomicReference<>(new Log.Context.Global(List.of())); /* * This should be an extent local when it is possible to be so. */ static final ThreadLocal<Log.Context.Child> LOCAL_CONTEXT = new ThreadLocal<>(); } ``` Since the local context will contain a pointer to the global context that was established when it was formed, to get the current context we just need to check if there is a currently bound local context and if so use it. If not, we take from the global context. ```java sealed interface Log { sealed interface Context { static Context current() { var localContext = LOCAL_CONTEXT.get(); return localContext == null ? GLOBAL_CONTEXT.get() : localContext; } // ... } } ``` And every log will have some attached context. 
```java public sealed interface Log { Context context(); Thread thread(); Flake flake(); Occurrence occurrence(); Level level(); Category category(); List<Entry> entries(); record Event( @Override Context context, @Override Thread thread, @Override Flake flake, @Override Occurrence.PointInTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log { public Event( Level level, Category category ) { this( Context.current(), Thread.currentThread(), Flake.create(), new Occurrence.PointInTime(Instant.now()), level, category, List.of() ); } } record Span( @Override Context context, @Override Thread thread, @Override Flake flake, Outcome outcome, @Override Occurrence.SpanOfTime occurrence, @Override Level level, @Override Category category, @Override List<Entry> entries ) implements Log { public Span( Outcome outcome, Occurrence.SpanOfTime occurrence, Level level, Category category, List<Entry> entries ) { this( Context.current(), Thread.currentThread(), Flake.create(), outcome, occurrence, level, category, entries ); } } // ... } ``` So to add some log entries to every log made in a scope, we just need to set and unset the thread local. ```java sealed interface Log { // ... static <T> T withContext(List<Entry> entries, Supplier<T> code) { var localContext = LOCAL_CONTEXT.get(); try { LOCAL_CONTEXT.set(new Context.Child.Plain( entries, localContext == null ? GLOBAL_CONTEXT.get() : localContext )); return code.get(); } finally { LOCAL_CONTEXT.set(localContext); } } static void withContext(List<Entry> entries, Runnable code) { withContext( entries, () -> { code.run(); return null; }); } // ... } ``` ```java Log.withContext( List.of(Log.Entry.of("request-id", "abc")), () -> { log.info("has-request-id!"); } ); ``` And the strategy is very similar for propagating spans, with the difference that log entries are not propagated, just metadata. ```java interface Logger { // ...
default <T> T span( Log.Level level, Log.Category category, List<Log.Entry> entries, Supplier<T> code ) { Log.Span.Outcome outcome = Log.Span.Outcome.Ok.INSTANCE; var start = Instant.now(); var localContext = LOCAL_CONTEXT.get(); try { LOCAL_CONTEXT.set(new Log.Context.Child.Span( Thread.currentThread(), start, Flake.create(), localContext == null ? GLOBAL_CONTEXT.get() : localContext )); return code.get(); } catch (Throwable t) { outcome = new Log.Span.Outcome.Error(t); throw t; } finally { LOCAL_CONTEXT.set(localContext); var end = Instant.now(); var duration = Duration.between(start, end); var occurrence = new Log.Occurrence.SpanOfTime( start, duration ); log(new Log.Span( outcome, occurrence, level, category, entries )); } } // ... } ``` Now putting it all together. ```java Log.setGlobalContext(List.of( Log.Entry.of( "os.name", System.getProperty("os.name") ) )); // ... log.infoSpan( "handling-request", () -> { Log.withContext( List.of(Log.Entry.of("request-id", "abc")), () -> { log.warn( "oh-no!", Log.Entry.of("failed-for-id", 123) ); } ); } ); ``` Looking past the lambdas, which are only taking so much visual budget because of the trivial-ness of the example, I am pretty happy with this API. The innermost log will have both the `request-id` and `os.name` available within its context as well as the `failed-for-id` directly in its `entries` component. Crawling for all the log entries available in the entire linked list for a log is slightly non-trivial, so I think it would be beneficial for `Log` itself to implement `Iterable<Log.Entry>` and do that crawling upon request. A lot of the code so far has been ugly on the outside. I think this code is ugly on the inside too. ```java sealed interface Log extends Iterable<Log.Entry> { // ...
@Override default Iterator<Entry> iterator() { return new Iterator<>() { Iterator<Entry> iter = entries().iterator(); Context ctx = context(); @Override public boolean hasNext() { if (iter.hasNext()) { return true; } else { if (ctx instanceof Context.Child.Plain plainCtx) { iter = plainCtx.entries().iterator(); ctx = plainCtx.parent(); return this.hasNext(); } else if (ctx instanceof Context.Child.Span spanCtx) { ctx = spanCtx.parent(); return this.hasNext(); } else if (ctx instanceof Context.Global globalCtx) { iter = globalCtx.entries().iterator(); ctx = null; return this.hasNext(); } else { return false; } } } @Override public Entry next() { if (iter.hasNext()) { return iter.next(); } else { if (ctx instanceof Context.Child.Plain plainCtx) { iter = plainCtx.entries().iterator(); ctx = plainCtx.parent(); return this.next(); } else if (ctx instanceof Context.Child.Span spanCtx) { ctx = spanCtx.parent(); return this.next(); } else if (ctx instanceof Context.Global globalCtx) { iter = globalCtx.entries().iterator(); ctx = null; return this.next(); } else { throw new NoSuchElementException(); } } } }; } } ``` ### LoggerFactory Now to have parity with `SLF4J` we should be able to delegate logging to a specific implementation on the class/module-path. To do this, we first need to make an interface for creating loggers. ```java interface LoggerFactory { Logger createLogger(); } ``` And then we need to declare in our `module-info.java` that we are interested in consuming external implementors of this interface - just in case people ever decide to put this on the module-path. ```java import dev.mccue.log.alpha.LoggerFactory; module dev.mccue.log.alpha { exports dev.mccue.log.alpha; uses LoggerFactory; } ``` Then we can build a factory function which scans through available implementations and picks one. 
```java public interface LoggerFactory { static LoggerFactory create() { var loggerFactories = ServiceLoader.load(LoggerFactory.class).iterator(); if (!loggerFactories.hasNext()) { System.err.println( "No logger factory supplied. Falling back to no-op logger" ); return () -> (__) -> { }; } else { var service = loggerFactories.next(); if (loggerFactories.hasNext()) { var services = new ArrayList<LoggerFactory>(); services.add(service); while (loggerFactories.hasNext()) { services.add(loggerFactories.next()); } System.err.printf( "Multiple logger factories supplied: %s. Picking one at random.%n", services ); return services.get(ThreadLocalRandom.current().nextInt(0, services.size())); } else { return service; } } } Logger createLogger(); } ``` The reason for the indirection - using a `LoggerFactory` instead of a `Logger` - is to allow for implementations to do some one-time set up logic and potentially set up logic per created logger. In the context of Java, the namespace of a log would often be the name of the class that the log is in. This is why SLF4J has `LoggerFactory.getLogger(Class<?>)` as [the obvious way to get a logger](https://www.slf4j.org/manual.html). Matching that convention is easy enough. ```java public interface LoggerFactory { static LoggerFactory create() { // ... 
} static Logger getLogger() { return create().createLogger(); } static Logger.Namespaced getLogger(String namespace) { return getLogger().namespaced(namespace); } static Logger.Namespaced getLogger(Class<?> klass) { return getLogger().namespaced(klass.getCanonicalName()); } Logger createLogger(); } ``` So now constructing a logger will look a lot like SLF4J ```java public final class Main { private static final Logger.Namespaced log = LoggerFactory.getLogger(Main.class); public static void main(String[] args) { log.info( "item-delivered", Log.Entry.of("cost", "everything") ); } } ``` ### Generation There will probably be an itch for some to generate the `getLogger` call [like what lombok does for other loggers](https://projectlombok.org/features/log). I can't add to lombok, I have a life, but I can [generate the code with an annotation processor](https://mccue.dev/pages/1-23-22-code-generation). ```java @DeriveLogger public final class Main implements MainLog { public static void main(String[] args) { log.info( "item-delivered", Log.Entry.of("cost", "everything") ); } } ``` ```java // ~= to what is generated. sealed interface MainLog permits Main { Logger.Namespaced log = LoggerFactory.getLogger(Main.class); } ``` ## What was the point of all this? As [someone](https://www.reddit.com/r/programming/comments/xnt17g/comment/ipvb6md/?utm_source=share&utm_medium=web2x&context=3) correctly pointed out to me, I never really clarified why any of this is "better". The gist of it is that this API allows and enforces [structured logging](https://docs.newrelic.com/docs/more-integrations/open-source-telemetry-integrations/opentelemetry/best-practices/opentelemetry-best-practices-logs/). That is, because there is a data-ified representation of events that happen within your system you can trivially transform them into structured formats such as JSON. 
When your logs are all JSON you can use tools like [CloudWatch Metrics Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/query_with_cloudwatch-metrics-insights.html) to directly search over your data.

```sql
fields @timestamp, @message
| sort @timestamp desc
| filter uri = '/something'
| limit 20
```

The exact methodology for this will vary from service to service, but the overarching theme is that structured data is searchable. Text data is "grep-able".

Logically, when you log something in a classical framework you take a representation in memory and turn it into some "English." Then when you inspect those logs you need to undo that transformation to English to do searching. Because it's 2022 there is no chance you could search those logs by hand at any reasonable scale, so you have to fall back to taking the part of your log message that is a constant and searching for that.

So with structured logging, you can skip the English and ship data directly to your observability platforms.

This is just a potential API for structured logging or, more accurately, what `tracing` calls "in process tracing".

Let me know in the comments if this explanation doesn't track.

## What Now?

Well, I am pretty confident that the API is good enough to start experimenting with. You can find the code for that [here](https://github.com/bowbahdoe/log.alpha).

The annotation processor is also simple enough that I think it can be used right now. Code for that is [here](https://github.com/bowbahdoe/log.alpha.generate).

Both of those finished-ish components can be fetched from jitpack.
```xml
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>dev.mccue</groupId>
        <artifactId>log.alpha</artifactId>
        <version>main-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>dev.mccue</groupId>
        <artifactId>log.alpha.generate</artifactId>
        <version>main-SNAPSHOT</version>
    </dependency>
</dependencies>
```

Everything else - like my rough draft JSON logger, publisher system implementation, etc - currently lives in these repos.

* [log.alpha.jackson](https://github.com/bowbahdoe/log.alpha.jackson)
* [log.alpha.simplejsonlogger](https://github.com/bowbahdoe/log.alpha.simplejsonlogger)
* [log.alpha.publisher](https://github.com/bowbahdoe/log.alpha.publisher)

Contributions very welcome. Feel free to reach out to me directly on Discord or similar.

### Next Steps

* Publisher system implementation

A lot of design decisions I made were in support of a hypothetical system where `µ/log`'s publisher scheme was translated. I wrote this up before finishing that prototype because I saw it would be a lot of work, and I didn't want to take the dive if there was no actual interest.

* SLF4J Bridge

I have a sketch of what one could look like that turns `SLF4J` logs into records with `slf4j/message`, `slf4j/arguments`, and `slf4j/mdc`. This technically works in terms of information conveyance but isn't very pretty. I bet someone knows how to do better.

* Console publisher

`tracing` proves that you can have structured logging and still have pretty, developer-friendly console logs. Everything I make is a Shrek so I probably need help to pull that off.

* Tests

I didn't test any of this. If you can look me in the eye and say you would have written unit tests for all the log entry `of` functions I commend you. Some stuff might need to change to allow for unit testing applications which care about asserting that logs happen in the form they expect, but I haven't gone down that rabbit hole yet.
* Benchmarks

I went at this maybe a bit too much by feel. I have no clue about the performance of this API. Is getting the current thread every time an issue? Should I have added predicates to check if log levels are enabled? Make those hundreds of logger overloads? No clue. I *should* break out JMH or do some profiling.

* Better Docs

If I said I didn't have the opportunity to write better docs that would be a lie. I spent that time watching [Letterkenny](https://www.youtube.com/watch?v=FC7a2uE-9_o). This whole thing probably counts as an explainer, but for there to be any chance of anyone using this the [rest of Diátaxis](https://diataxis.fr/) could use some attention. In my head this would take the form of fleshed out reference Javadocs, a tutorial or two, and some how-to guides on managing a migration from SLF4J.

* Real world usage

I need brave souls - whom I would love with all my heart, mind, and body - to try this API out in some real world applications. That is probably the only way to actually validate or invalidate any choices.

---

If nothing else, I hope this got some of you into the same nightmare head-space I'm trapped in.

Leave unconstructive criticism in the comments below.

Sun, 25 Sep 2022 05:00:00 +0000

Turn any Java program into a self-contained EXE

https://mccue.dev/pages/7-28-22-make-an-exe

Double-click to run is one of the easiest ways to open a program.

If the person you are sharing code with already has the right version of Java installed, they can double-click on a jar file to run it. You wrote it once, they can run it there.

If they don't have Java installed, then there are ways to create a runnable installer like [jpackage](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jpackage.html), but now they have to click through an installer to be able to run your code.
You can use [Native Image](https://www.graalvm.org/22.1/reference-manual/native-image/) to turn your code into an exe which won't require them to have anything installed, but now you have to abide by the [closed world assumption](https://docs.oracle.com/en/learn/understanding-reflection-graalvm-native-image/index.html#step-2-the-closed-world-assumption) and that's not always easy or possible. So this post is going to focus on a fairly oonga boonga approach that will work for any app, regardless of what dependencies you include or JVM features you make use of. The code along with an example GitHub workflow can be found in [this repo](https://github.com/bowbahdoe/java-exe-example) and final executables can be found [here](https://github.com/bowbahdoe/java-exe-example/releases/tag/v0.0.8). ## Prerequisites ### Java 9+ ```markdown java --version jlink --version ``` ### Maven ```markdown mvn --version ``` ### NodeJS ```markdown npx --version ``` ## Step 1. Compile and Package your code into a jar. This toy program will create a basic window that has some text that you can toggle between being capitalized. 
```java package example; import org.apache.commons.text.WordUtils; import javax.swing.*; import java.awt.*; public class Main { public static void main(String[] args) { var label = new JLabel("Hello, World!"); label.setFont(new Font("Serif", Font.PLAIN, 72)); var uppercaseButton = new JButton("Uppercase"); uppercaseButton.addActionListener(e -> label.setText(WordUtils.capitalize(label.getText())) ); var lowercaseButton = new JButton("lowercase"); lowercaseButton.addActionListener(e -> label.setText(WordUtils.uncapitalize(label.getText())) ); var panel = new JPanel(); panel.setLayout(new BoxLayout(panel, BoxLayout.Y_AXIS)); panel.add(label); panel.add(uppercaseButton); panel.add(lowercaseButton); var frame = new JFrame("Basic Program"); frame.add(panel); frame.pack(); frame.setVisible(true); frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE); } } ``` <img src="/pages/7-28-22-gif.gif" alt="Program Demonstration"></img> The goal is to package up your code, along with its dependencies, into a jar. [Jars are just zip files with a little extra structure](https://docs.oracle.com/javase/7/docs/technotes/guides/jar/jar.html). For a [Maven](https://maven.apache.org/) project the configuration will look like the following. 
```xml <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>example</groupId> <artifactId>javaexe</artifactId> <version>1.0</version> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <maven.compiler.source>18</maven.compiler.source> <maven.compiler.target>18</maven.compiler.target> </properties> <dependencies> <dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-text</artifactId> <version>1.9</version> </dependency> </dependencies> <build> <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>2.4.3</version> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <manifestEntries> <Main-Class>example.Main</Main-Class> <Build-Number>1.0</Build-Number> </manifestEntries> </transformer> </transformers> </configuration> </execution> </executions> </plugin> </plugins> </build> </project> ``` Where the "shade" plugin will handle including the code from all of your dependencies into the jar. In this case, the only external dependency is [`org.apache.commons/commons-text`](https://commons.apache.org/proper/commons-text/). ```markdown mvn clean package ``` Then for the purposes of this guide we will move that jar into a new directory where it will be separate from whatever other files are in `target/`. ```markdown mkdir build mv target/javaexe-1.0.jar build ``` ## Step 2. Create a Java Runtime Environment In order to run the jar from the previous step, we will need to bundle it with a Java Runtime Environment. 
To do this we will use [`jlink`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jlink.html). Since [the Java ecosystem hasn't embraced modules](https://blog.frankel.ch/update-state-java-modularization/), you most likely haven't heard of or used [`jlink`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jlink.html). The short pitch is that it can create "custom runtime images." Say you are making a web server. You don't need AWT or Swing, so including all the code for that is a tad wasteful. With [`jlink`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jlink.html) you can make a JRE that doesn't include the [`java.desktop`](https://docs.oracle.com/en/java/javase/11/docs/api/java.desktop/module-summary.html) module at all. This system works best if your application and all of its dependencies include compiled `module-info.java` files which let [`jlink`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jlink.html) know exactly what modules you want to include. You can also manually figure out the list of required modules by using [`jdeps`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jdeps.html) and a bit of detective work. Even without a modular project though, we can still use [`jlink`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jlink.html) to effectively clone our Java installation to a directory. ```markdown jlink --add-modules ALL-MODULE-PATH --output build/runtime ``` Including every module gives confidence that libraries like [`org.apache.commons/commons-text`](https://commons.apache.org/proper/commons-text/) will work as intended, even though we never figured out what modules they actually require. ## Step 3. Bundle the Jar and the JRE into an executable So with a jar containing our code and all of its dependencies in one hand and a JRE in the other, all that's left is to stitch the two together. The general technique for that is to 1. 
Zip up the directory containing the JRE and your application jar.
2. Attach a stub script to the top of that zip file which will extract the zip to a temporary directory and run the code.

There is a JavaScript library which does this called [caxa](https://github.com/leafac/caxa). Its purpose is making NodeJS projects into executables, so it will also bundle whatever NodeJS installation is on the system. That step can luckily be skipped by passing the `--no-include-node` flag, so it will work just fine for this.

```markdown
npx caxa \
    --input build \
    --output application \
    --no-include-node \
    -- "{{caxa}}/runtime/bin/java" "-jar" "{{caxa}}/javaexe-1.0.jar"
```

This will create an executable called "`application`." If you are doing this for Windows you should specify "`application.exe`." When the executable is run, each `{{caxa}}` in the command will be replaced with the temporary directory where the zip file was expanded.

---

I am aware of [jdeploy](https://www.jdeploy.com/) - and it does handle stuff that I didn't cover or would be relatively hard with this scheme like code signing or automatic updates - but as far as I can tell it still [requires that users run an installer](https://www.jdeploy.com/docs/manual/#_the_installer).

On code signing, there [is an open issue with caxa](https://github.com/leafac/caxa/issues/40) to figure out how to do that. I can make another post or update this one if an approach is figured out. I don't quite understand the issue, so I don't feel qualified to comment.

If any mildly ambitious reader wants to try their hand at making [caxa](https://github.com/leafac/caxa) in a different language so this process isn't dependent on the JS ecosystem I encourage it.

As always, comments and corrections welcome.

Thu, 28 Jul 2022 05:00:00 +0000

The different ways to handle errors in C

https://mccue.dev/pages/7-27-22-c-errors

C doesn't have a single clear way to handle errors.
The tutorials out there are [pretty](https://www.geeksforgeeks.org/error-handling-c-programs/) [much](https://www.studytonight.com/c/error-handling-in-c.php) [garbage](https://www.tutorialspoint.com/cprogramming/c_error_handling.htm) too. So for this post, we are going to work with the toy example of a function that parses natural numbers from a string and go through the different approaches. Code samples can be found in a compilable state [here](https://github.com/bowbahdoe/c-error-examples). ## 1. The Ostrich Algorithm This might sound silly, but how often are you really going to run out of memory? If an error condition is rare enough, you can always just dig your head in the sand and choose to ignore the possibility. <img src="/pages/7-27-22-ostrich.png" alt="Ostrich burying its head in sand"></img> This can make code a lot prettier, but at the cost of robustness. ```c #include <stdio.h> int parse_natural_base_10_number(const char* s) { int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { parsed *= 10; parsed += s[i] - '0'; } return parsed; } int main() { printf("Expecting garbage or crash on bad values\n"); const char* examples[] = { "10", "foo", "42", "" }; for (size_t i = 0; i < 4; i++) { const char* example = examples[i]; int parsed = parse_natural_base_10_number(example); printf("parsed: %d\n", parsed); } return 0; } ``` ```markdown Expecting garbage or crash on bad values parsed: 10 parsed: 6093 parsed: 42 parsed: 0 ``` A real world example of this can be seen [with the firmware for flipper devices'](https://github.com/flipperdevices/flipperzero-firmware/blob/dev/applications/rpc/rpc_storage.c#L117) use of [`malloc`](https://cplusplus.com/reference/cstdlib/malloc/). ## 2. Crash. Sometimes errors aren't practically recoverable. Most applications should probably just give up when [`malloc`](https://cplusplus.com/reference/cstdlib/malloc/) returns `NULL`. 
If you are sure that there isn't a way to recover from an error condition and that a caller won't want to handle it in any other way, you can just print a message saying what went wrong and exit the program. ```c #include <stdio.h> #include <stdlib.h> int parse_natural_base_10_number(const char* s) { int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { printf( "Got a bad character ('%c') in %s, crashing.", s[i], s ); exit(1); } else { parsed *= 10; parsed += s[i] - '0'; } } return parsed; } int main() { const char* examples[] = { "10", "42", "foo" }; for (size_t i = 0; i < 3; i++) { const char* example = examples[i]; int parsed = parse_natural_base_10_number(example); printf("parsed: %d\n", parsed); } return 0; } ``` ```markdown parsed: 10 parsed: 42 Got a bad character ('f') in foo, crashing. ``` You can see this approach [in the code of OpenBLAS](https://github.com/xianyi/OpenBLAS/blob/develop/benchmark/geev.c#L100-122). ## 3. Return a negative number. If the function normally would return a natural number, then you can use a negative number to indicate a failure. This is applicable both to our toy example and cases like returning the number of bytes read from a file. If there are different kinds of errors for this sort of case you could also use specific negative numbers to indicate the different categories. 
```c
#include <stdio.h>

int parse_natural_base_10_number(const char* s) {
    int parsed = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9') {
            return -1;
        } else {
            parsed *= 10;
            parsed += s[i] - '0';
        }
    }

    return parsed;
}

int main() {
    const char* examples[] = { "10", "foo", "42" };
    for (size_t i = 0; i < 3; i++) {
        const char* example = examples[i];
        int parsed = parse_natural_base_10_number(example);
        if (parsed < 0) {
            printf("failed: %s\n", example);
        } else {
            printf("worked: %d\n", parsed);
        }
    }

    return 0;
}
```

```markdown
worked: 10
failed: foo
worked: 42
```

You can see examples of this [in the Linux kernel.](https://github.com/torvalds/linux/blob/master/tools/testing/selftests/net/ipsec.c#L131-151)

## 4. Return NULL

If the function would normally return a pointer, then you can use `NULL` to indicate that something went wrong.

Most functions that would be returning pointers will be doing heap allocation in order for that to be sound, so this scheme is likely not applicable when you want to avoid allocations.

Also, let's be real, it feels silly to heap allocate an int.

```c
#include <stdio.h>
#include <stdlib.h>

int* parse_natural_base_10_number(const char* s) {
    int parsed = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9') {
            return NULL;
        } else {
            parsed *= 10;
            parsed += s[i] - '0';
        }
    }

    int* result = malloc(sizeof (int));
    if (result == NULL) {
        // Allocation failure is reported the same way as a parse failure.
        return NULL;
    }
    *result = parsed;
    return result;
}

int main() {
    const char* examples[] = { "10", "foo", "42" };
    for (size_t i = 0; i < 3; i++) {
        const char* example = examples[i];
        int* parsed = parse_natural_base_10_number(example);
        if (parsed == NULL) {
            printf("failed: %s\n", example);
        } else {
            printf("worked: %d\n", *parsed);
        }

        // free(NULL) is a no-op, so this is safe on the failure path too.
        free(parsed);
    }

    return 0;
}
```

```markdown
worked: 10
failed: foo
worked: 42
```

A real world example of this scheme is [`malloc`](https://cplusplus.com/reference/cstdlib/malloc/).
If malloc fails to allocate memory, then instead of returning a pointer to newly allocated memory it will return a null pointer.

## 5. Return a boolean and take an out param

One of the less obvious things you can do in C is to make one or more of a function's arguments "out params". This means that it is part of the contract of the function that it will write into the memory behind a pointer.

If a function can fail, a natural translation of this can be to return a boolean indicating whether it succeeded and take an out param that you only inspect when true is returned.

```c
#include <stdio.h>
#include <stdbool.h>

bool parse_natural_base_10_number(const char* s, int* out) {
    int parsed = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9') {
            return false;
        } else {
            parsed *= 10;
            parsed += s[i] - '0';
        }
    }

    // Only written on success; callers must not read it otherwise.
    *out = parsed;
    return true;
}

int main() {
    const char* examples[] = { "10", "foo", "42" };
    for (size_t i = 0; i < 3; i++) {
        const char* example = examples[i];
        int parsed;
        bool success = parse_natural_base_10_number(
            example,
            &parsed
        );
        if (!success) {
            printf("failed: %s\n", example);
        } else {
            printf("worked: %d\n", parsed);
        }
    }

    return 0;
}
```

```markdown
worked: 10
failed: foo
worked: 42
```

This is done pretty regularly [in Windows.](https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getwindowdisplayaffinity)

## 6. Return an enum and take an out param

A boolean can only indicate that something succeeded or failed. If you want to know why something failed then substituting an enum for the boolean is a pretty natural mechanism.
```c #include <stdio.h> enum ParseNaturalNumberResult { PARSE_NATURAL_SUCCESS, PARSE_NATURAL_EMPTY_STRING, PARSE_NATURAL_BAD_CHARACTER }; enum ParseNaturalNumberResult parse_natural_base_10_number( const char* s, int* out ) { if (s[0] == '\0') { return PARSE_NATURAL_EMPTY_STRING; } int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { return PARSE_NATURAL_BAD_CHARACTER; } else { parsed *= 10; parsed += s[i] - '0'; } } *out = parsed; return PARSE_NATURAL_SUCCESS; } int main() { const char* examples[] = { "10", "foo", "42", "" }; for (size_t i = 0; i < 4; i++) { const char* example = examples[i]; int parsed; switch (parse_natural_base_10_number(example, &parsed)) { case PARSE_NATURAL_SUCCESS: printf("worked: %d\n", parsed); break; case PARSE_NATURAL_EMPTY_STRING: printf("failed because empty string\n"); break; case PARSE_NATURAL_BAD_CHARACTER: printf("failed because bad char: %s\n", example); break; } } return 0; } ``` ```markdown worked: 10 failed because bad char: foo worked: 42 failed because empty string ``` ## 7. Return a boolean and take two out params While an enum can give you the "category" of an error, it doesn't have a place for recording any more specific information than that. For example, a pretty reasonable thing to want to know if you run into an unexpected character is where in the string that character was found. By adding a second out param you can have a place to put this information. 
```c #include <stdio.h> #include <stdbool.h> bool parse_natural_base_10_number( const char* s, int* out_value, size_t* out_bad_index ) { int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { *out_bad_index = i; return false; } else { parsed *= 10; parsed += s[i] - '0'; } } *out_value = parsed; return true; } int main() { const char* examples[] = { "10", "foo", "42", "12a34" }; for (size_t i = 0; i < 4; i++) { const char* example = examples[i]; int parsed; size_t bad_index; bool success = parse_natural_base_10_number( example, &parsed, &bad_index ); if (!success) { printf("failed: %s\n ", example); for (size_t j = 0; j < bad_index; j++) { printf(" "); } printf("^☹️\n"); } else { printf("worked: %d\n", parsed); } } return 0; } ``` ```markdown worked: 10 failed: foo ^☹️ worked: 42 failed: 12a34 ^☹️ ``` ## 8. Return an enum and multiple out params A natural extension of the previous two patterns is that if you have multiple ways in which a computation can fail, you can return an enum with each way and take an out param for each way that would require data. 
```c #include <stdio.h> #include <string.h> enum ParseNaturalNumberResult { PARSE_NATURAL_SUCCESS, PARSE_NATURAL_EMPTY_STRING, PARSE_NATURAL_BAD_CHARACTER, PARSE_NUMBER_TOO_BIG }; struct BadCharacterInfo { size_t index; }; struct TooBigInfo { size_t remaining_characters; }; enum ParseNaturalNumberResult parse_natural_base_10_number( const char* s, int* out_value, struct BadCharacterInfo* bad_character_info, struct TooBigInfo* too_big_info ) { if (s[0] == '\0') { return PARSE_NATURAL_EMPTY_STRING; } int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { bad_character_info->index = i; return PARSE_NATURAL_BAD_CHARACTER; } else { int digit = s[i] - '0'; if (__builtin_smul_overflow(parsed, 10, &parsed) || __builtin_sadd_overflow(parsed, digit, &parsed)) { too_big_info->remaining_characters = strlen(s) - i; return PARSE_NUMBER_TOO_BIG; } } } *out_value = parsed; return PARSE_NATURAL_SUCCESS; } int main() { const char* examples[] = { "10", "foo", "42", "", "99999999999999" }; for (size_t i = 0; i < 5; i++) { const char* example = examples[i]; int parsed; struct BadCharacterInfo bad_character_info; struct TooBigInfo too_big_info; switch (parse_natural_base_10_number( example, &parsed, &bad_character_info, &too_big_info )) { case PARSE_NATURAL_SUCCESS: printf("worked: %d\n", parsed); break; case PARSE_NATURAL_EMPTY_STRING: printf("failed because empty string\n"); break; case PARSE_NATURAL_BAD_CHARACTER: printf( "failed because bad char at index %zu: %s\n", bad_character_info.index, example ); break; case PARSE_NUMBER_TOO_BIG: printf( "number was too big. had %zu digits left: %s\n", too_big_info.remaining_characters, example ); break; } } return 0; } ``` ```markdown worked: 10 failed because bad char at index 0: foo worked: 42 failed because empty string number was too big. had 5 digits left: 99999999999999 ``` ## 9. Set a thread local static value Another option is to, on an error, set a thread local static variable. 
This avoids needing to propagate an error explicitly all the way up the stack from where it occurs and makes the "normal" API of the function look as neat and clean as the ostrich or crash approaches.

Once you set the thread local static value, either you

1. Return a predictable value indicating an issue (`NULL`, a negative number, etc) which hints to the programmer to check the thread local static value.
2. Return an uninitialized value and rely on the programmer to know that the value might be bogus unless they check the thread local static value.

```c
#include <stdio.h>
#include <stdbool.h>

_Thread_local static bool parse_number_error = false;

int parse_natural_base_10_number(const char* s) {
    int parsed = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9') {
            parse_number_error = true;
        } else {
            parsed *= 10;
            parsed += s[i] - '0';
        }
    }

    return parsed;
}

int main() {
    const char* examples[] = { "10", "42", "foo" };
    for (size_t i = 0; i < 3; i++) {
        const char* example = examples[i];
        int parsed = parse_natural_base_10_number(example);
        if (parse_number_error) {
            // Reset the flag so the next call starts from a clean state.
            parse_number_error = false;
            printf("error: %s\n", example);
        } else {
            printf("parsed: %d\n", parsed);
        }
    }

    return 0;
}
```

```markdown
parsed: 10
parsed: 42
error: foo
```

A good deal of built-in APIs use a shared, thread-local, modifiable int called [`errno`](https://en.cppreference.com/w/cpp/error/errno), and if they fail they will set it to a non-zero value. There are then functions like [`perror`](https://en.cppreference.com/w/c/io/perror) which can print a message for the specific error code.

You technically are allowed to use [`errno`](https://en.cppreference.com/w/cpp/error/errno) too, as long as your error conditions can fit into its int encoding.

This is my least favorite of the patterns.

## 10. Return a tagged union

The next approach is what languages like [`Rust`](https://www.rust-lang.org/) use under the hood for their enums.

You make a struct containing two things

1. A "tag".
This should be a boolean or an enum depending on your tastes and the number of possibilities. 2. A union containing enough space for the data that should be associated with each "tag". Then you return this struct directly. The tag tells the caller which field of the union is safe to access and consequently what the "result" of the computation was. Compared to the out param solutions, where normally you would allocate each possible out param on the stack, this will compact the required space by way of the union. It also uses regular return values and checking the tag before checking the union is a relatively standard process. Unfortunately it will also make code more verbose than most of the other options. ```c #include <stdio.h> enum ParseNaturalNumberResultKind { PARSE_NATURAL_SUCCESS, PARSE_NATURAL_EMPTY_STRING, PARSE_NATURAL_BAD_CHARACTER }; struct BadCharacter { size_t index; char c; }; struct ParseNaturalNumberResult { enum ParseNaturalNumberResultKind kind; union { int success; struct BadCharacter bad_character; } data; }; struct ParseNaturalNumberResult parse_natural_base_10_number( const char* s ) { if (s[0] == '\0') { struct ParseNaturalNumberResult result = { .kind = PARSE_NATURAL_EMPTY_STRING }; return result; } int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { struct ParseNaturalNumberResult result = { .kind = PARSE_NATURAL_BAD_CHARACTER, .data = { .bad_character = { .index = i, .c = s[i] } } }; return result; } else { parsed *= 10; parsed += s[i] - '0'; } } struct ParseNaturalNumberResult result = { .kind = PARSE_NATURAL_SUCCESS, .data = { .success = parsed } }; return result; } int main() { const char* examples[] = { "10", "foo", "42", "12a34" }; for (size_t i = 0; i < 4; i++) { const char* example = examples[i]; struct ParseNaturalNumberResult result = parse_natural_base_10_number(example); switch (result.kind) { case PARSE_NATURAL_SUCCESS: printf("worked: %d\n", result.data.success); break; case 
PARSE_NATURAL_EMPTY_STRING:
                printf("got empty string\n");
                break;
            case PARSE_NATURAL_BAD_CHARACTER:
                printf("failed: %s\n        ", example);
                for (size_t j = 0; j < result.data.bad_character.index; j++) {
                    printf(" ");
                }
                printf(
                    "^☹️ '%c' is not good\n",
                    result.data.bad_character.c
                );
                break;
        }
    }

    return 0;
}
```

```markdown
worked: 10
failed: foo
        ^☹️ 'f' is not good
worked: 42
failed: 12a34
          ^☹️ 'a' is not good
```

This is a very common pattern, especially when writing programs like language parsers where it is hard to avoid functions which can return one of many differently shaped possibilities.

There are some examples [here in the curl codebase](https://github.com/curl/curl/blob/master/lib/ftplistparser.c) of using the general mechanism for the result of parsing.

## 11. Return a boxed "error object"

The last one here is probably the toughest sell. It is more verbose than the other approaches, requires heap allocation, and requires a non-trivial degree of comfort in C. It does have its perks though.

First, make a ["vtable"](https://en.wikipedia.org/wiki/Virtual_method_table#:~:text=A%20virtual%20method%20table%20(VMT,run%2Dtime%20method%20binding).). This will be a struct containing pointers to functions which take as their first argument a void pointer.

For errors, let's say the things we will want to do are [produce an error message](https://go.dev/blog/error-handling-and-go) and dispose of any allocated resources afterward.

```c
struct ErrorOps {
    char* (*describe)(const void*);
    void (*free)(void*);
};
```

Then make a struct which contains this [vtable](https://en.wikipedia.org/wiki/Virtual_method_table#:~:text=A%20virtual%20method%20table%20(VMT,run%2Dtime%20method%20binding).) as well as a pointer to the memory that is meant to be passed as the first argument to each function within.

```c
struct Error {
    struct ErrorOps ops;
    void* self;
};
```

You can then make some helpers for doing the calling.
```c char* error_describe(struct Error error) { return error.ops.describe(error.self); } void error_free(struct Error error) { if (error.ops.free != NULL) { error.ops.free(error.self); } } ``` Then for each error condition, define how each operation should work as well as any helper functions and structs that you need. ```c char* empty_string_describe(const void* self) { char* result; asprintf(&result, "Empty string is not good"); return result; } const struct ErrorOps empty_string_error_ops = { .describe = empty_string_describe, .free = NULL }; struct Error empty_string_error() { struct Error result = { .ops = empty_string_error_ops, .self = NULL }; return result; } ``` ```c struct BadCharacterError { char* source; size_t index; }; char* bad_character_describe(const void* self) { const struct BadCharacterError* this = self; char* result; asprintf( &result, "Bad character in %s at index %zu: '%c'", this->source, this->index, this->source[this->index] ); return result; } void bad_character_free(void* self) { struct BadCharacterError* this = self; free(this->source); free(this); } const struct ErrorOps bad_character_error_ops = { .describe = bad_character_describe, .free = bad_character_free }; struct Error bad_character_error(const char* source, size_t index) { struct BadCharacterError* error = malloc(sizeof (struct BadCharacterError)); char* source_clone = calloc(strlen(source) + 1, sizeof (char)); strcpy(source_clone, source); error->source = source_clone; error->index = index; struct Error result = { .ops = bad_character_error_ops, .self = error }; return result; } ``` Then, by any of the previous schemes, return one of these error structs if something goes wrong. 
```c struct ParseNaturalNumberResult { bool success; union { int success; struct Error error; } data; }; struct ParseNaturalNumberResult parse_natural_base_10_number( const char* s ) { if (s[0] == '\0') { struct ParseNaturalNumberResult result = { .success = false, .data = { .error = empty_string_error() } }; return result; } int parsed = 0; for (size_t i = 0; s[i] != '\0'; i++) { if (s[i] < '0' || s[i] > '9') { struct ParseNaturalNumberResult result = { .success = false, .data = { .error = bad_character_error(s, i) } }; return result; } else { parsed *= 10; parsed += s[i] - '0'; } } struct ParseNaturalNumberResult result = { .success = true, .data = { .success = parsed } }; return result; } int main() { const char* examples[] = { "10", "foo", "42", "12a34" }; for (size_t i = 0; i < 4; i++) { const char* example = examples[i]; struct ParseNaturalNumberResult result = parse_natural_base_10_number(example); if (!result.success) { char* description = error_describe(result.data.error); printf("error: %s\n", description); free(description); error_free(result.data.error); } else { printf("success: %d\n", result.data.success); } } return 0; } ``` ```markdown success: 10 error: Bad character in foo at index 0: 'f' success: 42 error: Bad character in 12a34 at index 2: 'a' ``` So... why do this? <a href="https://www.youtube.com/watch?v=idYERhOsw54"><img src="/pages/7-27-22-crystals.png" alt="Crystals!"></img></a> Because it is easy to compose this kind of error. Say we extended our problem such that we were reading a number from a file. Now the set of things that can go wrong includes all sorts of file reading related errors. It is a lot easier to include those errors if there is a way to treat them the "same" as the ones encountered during parsing. This accomplishes that. 
```c
struct FileOperationError {
    int error_number;
};

char* file_operation_error_describe(const void* self) {
    const struct FileOperationError* this = self;
    char* result;
    asprintf(&result, "%s", strerror(this->error_number));
    return result;
}

void file_operation_error_free(void* self) {
    free(self);
}

const struct ErrorOps file_operation_error_ops = {
    .describe = file_operation_error_describe,
    .free = file_operation_error_free
};

struct Error file_operation_error(int error_number) {
    struct FileOperationError* file_operation_error =
        malloc(sizeof (struct FileOperationError));
    file_operation_error->error_number = error_number;

    struct Error result = {
        .ops = file_operation_error_ops,
        .self = file_operation_error
    };
    return result;
}

struct ReadNumberFromFileResult {
    bool success;
    union {
        int success;
        struct Error error;
    } data;
};

struct ReadNumberFromFileResult read_number_from_file(
    const char* path
) {
    FILE* fp = fopen(path, "r");
    if (fp == NULL) {
        struct ReadNumberFromFileResult result = {
            .success = false,
            .data = { .error = file_operation_error(errno) }
        };
        errno = 0;
        // Nothing was opened, so there is nothing to close here.
        return result;
    }

    // Max positive int is only 10 characters big in base 10
    // Start with an empty string so an empty file parses as "empty"
    // instead of reading an uninitialized buffer.
    char first_line[12] = "";
    fgets(first_line, sizeof (first_line), fp);

    if (ferror(fp)) {
        struct ReadNumberFromFileResult result = {
            .success = false,
            .data = { .error = file_operation_error(errno) }
        };
        errno = 0;
        fclose(fp);
        return result;
    }

    struct ParseNaturalNumberResult parse_result =
        parse_natural_base_10_number(first_line);

    if (!parse_result.success) {
        struct ReadNumberFromFileResult result = {
            .success = false,
            .data = { .error = parse_result.data.error }
        };
        fclose(fp);
        return result;
    }

    struct ReadNumberFromFileResult result = {
        .success = true,
        .data = { .success = parse_result.data.success }
    };
    fclose(fp);
    return result;
}

int main() {
    const char* examples[] = { "../ex1", "../ex2", "../ex3" };
    for (size_t i = 0; i < 3; i++) {
        const char* example_file = examples[i];
        struct ReadNumberFromFileResult result =
            read_number_from_file(example_file);
        if (!result.success) {
            char* description = error_describe(result.data.error);
            printf("error: %s\n", description);
            free(description);
            error_free(result.data.error);
        }
        else {
            printf("success: %d\n", result.data.success);
        }
    }

    return 0;
}
```

```markdown
success: 8
error: Bad character in abc at index 0: 'a'
error: No such file or directory
```

This can all be done with tagged unions as well, so it is a judgement call. This sort of pattern definitely has more appeal when the language being used makes it convenient.

---

Important to note that I am not a professional C programmer. I fully expect to be shown the error of my ways in the comments below.

Wed, 27 Jul 2022 05:00:00 +0000

Publish a Java library to Maven Central without Maven or Gradle

https://mccue.dev/pages/6-1-22-upload-to-maven-central

Say, like me, you have some code you want to share with the world.

```java
package dev.mccue.datastructures;

/**
 * "Sum Type" representation of a linked list.
 */
public sealed interface LinkedList<T> {
    /**
     * An empty list.
     */
    record Empty<T>() implements LinkedList<T> {}

    /**
     * A not empty list.
     */
    record NotEmpty<T>(T first, LinkedList<T> rest) implements LinkedList<T> {}
}
```

To do this, you need to put that code in a place others can find it.

For Python programmers this means publishing to [PyPI](https://pypi.org/), JavaScript programmers to [npm](https://www.npmjs.com/), Rust programmers to [crates.io](https://crates.io/), and C++ programmers to somewhere I assume.

For Java there are a few options, but the only one that will work by default in every build tool is [Maven Central](https://search.maven.org/). It's apparently really good at [being a repository](https://github.com/bowbahdoe/magic-bean/issues/8#issuecomment-1023872969), so publishing there is the thing to do.

There are plugins for all the major build tools that do this.
However, last I tried, [uploading a Java 16+ library to Maven Central using Maven was busted](https://issues.sonatype.org/browse/OSSRH-66257) and [requires exposing Java internals to work around](https://github.com/bowbahdoe/magic-bean/blob/main/.github/workflows/publish.yml#L27). So we are going to do something a little different. I am going to show you how to go through the entire process manually in the hope that it is straightforward enough to write your own scripts to do. ## Prerequisites to follow along ### Java ```markdown javac --version jar --version javadoc --version ``` ### gpg ```markdown gpg --version ``` ### curl ```markdown curl --version ``` ### git ```markdown git --version ``` ### Github CLI ```markdown gh --version ``` ## Step 1. Write your code For this example I am going to put the linked list code from the top of the page in a file `src/dev/mccue/datastructures/LinkedList.java` and make a small `.gitignore`. ```markdown target/ .idea/ *.iml .DS_Store ``` ## Step 2. Add your code to a git repo ```markdown git init git add src/ git add .gitignore git commit -m "Initial Commit" ``` ## Step 3. Put that git repo on the internet You will need a public url to refer to later and services like Github are convenient for that. ```markdown gh auth login gh repo create --public --source . git branch -M main git push origin main ``` ## Step 4. Get unique coordinates Unlike other package repositories, Maven Central requires that you have a unique "group id" to prefix any packages you make. You cannot publish code under `com.google`, only Google can. To meet this requirement you either need to * Buy a domain name. You can do this through a lot of websites. I personally use [namecheap](https://www.namecheap.com/), but there are quite a few options. Once you do this you can publish code under `com.yoursite`. * Make an account on one of the git hosting services. 
This is the easiest way, but you will only be able to publish under `io.github.yourusername` or similar.

## Step 5. Make an account with Sonatype

Once you have that all settled

1. Make an account [here](https://issues.sonatype.org/secure/Signup!default.jspa). Save the username and password.
2. Make a ticket [here](https://issues.sonatype.org/secure/CreateIssue.jspa?pid=10134&issuetype=21). You will need to prove that you own the website or git account that you want to use for your group id.

This is an annoying step, I know, but it is what it is. If you get stuck here, ask in the comments below and I'll add more clarification.

## Step 6. Compile your code

```markdown
javac -d target/classes -g --release 17 src/**/*.java
```

The `-g` includes debug information. Always do that.

## Step 7. Generate documentation for your code

```markdown
javadoc -d target/doc src/**/*.java
```

If you get warnings about undocumented classes and methods, ignoring them is a choice you are technically allowed to make.

## Step 8. Decide on a version number

When you publish code there is the implicit assumption that you might upload newer versions of that code at a later point in time.

To distinguish between versions, you need to number them. There are a few schemes for doing this including [Semver](https://semver.org/), [Calver](https://calver.org/), and [0ver](https://0ver.org/).

In the commands from this point on, I am going to assume that the initial version being published is `0.0.1`, but you can do what you feel is best.

## Step 9. Zip your compiled code into a jar

As early Minecraft players learned when installing mods, Jar files are just zip files with a few extra bells and whistles.

```markdown
mkdir target/deploy
jar --create \
    --file target/deploy/datastructures-0.0.1.jar \
    -C target/classes .
```

## Step 10. Zip your source code into a jar

```markdown
jar --create \
    --file target/deploy/datastructures-0.0.1-sources.jar \
    -C src .
```

## Step 11. Zip your documentation into a jar

```markdown
jar --create \
    --file target/deploy/datastructures-0.0.1-javadoc.jar \
    -C target/doc .
```

## Step 12. Create a POM File

A POM - "Project Object Model" - file is the standard format for declaring information about your library including any dependencies it may have on other libraries.

This format is [going to be around forever](https://www.javaadvent.com/2021/12/from-maven-3-to-maven-5.html) and all build tools have to handle it.

The following I am going to put into `target/deploy/datastructures-0.0.1.pom`. This is the "minimal" POM and [every field I list needs to be specified](https://central.sonatype.org/publish/requirements/#project-name-description-and-url).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>dev.mccue</groupId>
    <artifactId>datastructures</artifactId>
    <version>0.0.1</version>
    <packaging>jar</packaging>

    <name>Datastructures</name>
    <description>Basic Datastructures for Java.</description>
    <url>https://github.com/bowbahdoe/java-datastructures</url>

    <licenses>
        <license>
            <name>The Apache Software License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
        </license>
    </licenses>

    <developers>
        <developer>
            <name>Ethan McCue</name>
            <email>ethan@mccue.dev</email>
            <organization>McCue Software Solutions</organization>
            <organizationUrl>https://www.mccue.dev</organizationUrl>
        </developer>
    </developers>

    <scm>
        <connection>scm:git:git://github.com/bowbahdoe/java-datastructures.git</connection>
        <developerConnection>scm:git:ssh://github.com:bowbahdoe/java-datastructures.git</developerConnection>
        <url>https://github.com/bowbahdoe/java-datastructures/tree/main</url>
    </scm>
</project>
```

## Step 13. Create a GPG Key

Okay so this part might feel weird.
The idea here was that you generate a public and private key. You sign all the files you upload with those keys and then later on someone can confirm that it was "you" that actually did that signing. Maven Central just makes sure that everything is signed, not that there is any way to associate the signed files back to you. Because public key infrastructure never really took off, this step is largely ceremonial in practice. You still need to do it though. [The official guide is more comprehensive than I am going to be](https://central.sonatype.org/publish/requirements/gpg/#generating-a-key-pair) ```markdown gpg --gen-key ``` Make sure to save your passphrase if you made one. ## Step 14. Distribute your GPG Key Run this command ```markdown gpg --list-keys ``` And you should get output that kinda looks like this. ```markdown pub rsa3072 2021-06-23 [SC] [expires: 2023-06-23] CA925CD6C9E8D064FF05B4728190C4130ABA0F98 uid [ultimate] Central Repo Test <central@example.com> sub rsa3072 2021-06-23 [E] [expires: 2023-06-23] ``` You want to take the part that looks like `CA925CD6C9E8D064FF05B4728190C4130ABA0F98` and run the following command. ```markdown gpg --keyserver keyserver.ubuntu.com \ --send-keys CA925CD6C9E8D064FF05B4728190C4130ABA0F98 ``` ## Step 15. Sign all the files with GPG ```markdown gpg --armor --detach-sign target/deploy/datastructures-0.0.1.jar gpg --armor --detach-sign target/deploy/datastructures-0.0.1-sources.jar gpg --armor --detach-sign target/deploy/datastructures-0.0.1-javadoc.jar gpg --armor --detach-sign target/deploy/datastructures-0.0.1.pom ``` If you are scripting this you should add `--pinentry-mode loopback` and provide your passphrase via `--passphrase`. ## Step 16. Zip all the jars into one large jar Yes, we are making a jar jar. 
<img src="/pages/6-1-22-jarjar.webp" alt="Jar Jar Binks"></img>

The most convenient api for uploading code manually is a [form submit on the gui](https://central.sonatype.org/publish/publish-manual/) that is undocumented. I wanted to use something more official, but I had trouble [finding what to do](https://grep.app/search?q=ProjectDeployerRequest). I think it's probably fine.

Said api wants one large jar as its input.

```markdown
jar --create --file target/bundle.jar -C target/deploy .
```

## Step 17. Log in to Sonatype

Use the username and password you got from step 5.

```markdown
curl --request GET \
  --url https://s01.oss.sonatype.org/service/local/authentication/login \
  --cookie-jar cookies.txt \
  --user USERNAME:PASSWORD
```

## Step 18. Upload the bundle to a staging repository

```markdown
curl --request POST \
  --url https://s01.oss.sonatype.org/service/local/staging/bundle_upload \
  --cookie cookies.txt \
  --header 'Content-Type: multipart/form-data' \
  --form file=@target/bundle.jar
```

When you run this command, you will get output back that looks like this

```json
{"repositoryUris":["https://s01.oss.sonatype.org/content/repositories/STAGING_REPOSITORY_ID"]}
```

At this point, you can pause and point a build tool to the staging repository to make sure that everything is okay with your code before releasing the final version.

## Step 19. Release the staging repository

Fill in the `STAGING_REPOSITORY_ID` from the output of the last command. There is no going back once the staging repository is released.

```markdown
curl --request POST \
  --url https://s01.oss.sonatype.org/service/local/staging/bulk/promote \
  --cookie cookies.txt \
  --header 'Content-Type: application/json' \
  --data '{
    "data": {
        "autoDropAfterRelease": true,
        "description": "",
        "stagedRepositoryIds": ["STAGING_REPOSITORY_ID"]
    }
}'
```

----

You can try out the linked list we just published by including it in your build tool of choice.
```xml
<dependency>
    <groupId>dev.mccue</groupId>
    <artifactId>datastructures</artifactId>
    <version>0.0.1</version>
</dependency>
```

A fully scripted version of this process can be seen [here](https://github.com/bowbahdoe/java-async-utils/blob/main/build/Build.java) along with [an associated Github workflow](https://github.com/bowbahdoe/java-async-utils/blob/main/.github/workflows/publish.yml)

Explain what a Maven MOJO is in 140 characters or less in the comments below.

Wed, 01 Jun 2022 05:00:00 +0000

Why is it that byteArrMap.remove is returning false

https://mccue.dev/pages/5-14-22-byte-arr-map-returning-false

## Question from theuntamed000#1481

> noob question why is that byteArrMap.remove returning false
>
> <img src="/pages/5-14-22-shell-1.png" alt="JShell Session Screenshot 1" ></img>
> <img src="/pages/5-14-22-shell-2.png" alt="JShell Session Screenshot 2" ></img>
>
> also does java use something like Integer.valueOf() while doing boxing

aight so first one

when you call remove with a byte array

those byte arrays are not equal

```java
jshell> byte[] b1 = new byte[0];
b1 ==> byte[0] { }

jshell> byte[] b2 = new byte[0];
b2 ==> byte[0] { }

jshell> b1 == b2
$3 ==> false

jshell> b1.equals(b2)
$4 ==> false
```

even though they have the same contents

> just read the docs , it uses Object.equals which uses references

So it won't find a matching key, thus it won't remove anything returning false.

Second, `new Integer(123)` and `Integer.valueOf(123)` might seem identical, but `new Integer` is a constructor call and all constructor calls need to return distinct objects.

Static methods don't have that restriction, so you can implement some degree of caching behind them.

Which, looking at the implementation of `Integer.valueOf`, is what is being done. Small numbers' Integer representations are cached.

```java
@IntrinsicCandidate
public static Integer valueOf(int i) {
    return i >= -128 && i <= Integer.IntegerCache.high ?
Integer.IntegerCache.cache[i + 128] : new Integer(i); } ``` But the exact strategy matters less than the fact that choosing to implement a strategy is possible with the "static factory." Even if the implementation was just ```java @IntrinsicCandidate public static Integer valueOf(int i) { return new Integer(i); } ``` there would be a value in it from a library design standpoint. And that is why this deprecated for removal makes sense. Removing the ability for libraries to call the constructor directly means the jdk would simply have more options for optimizations. ```java @Deprecated( since = "9", forRemoval = true ) public Integer(int value) { this.value = value; } ``` And when the language does autoboxing - yes it uses valueOf ```shell $ cat Box.java class Box { Integer f() { Integer i = 4; return i; } } $ javac Box.java $ javap -v Box.class ``` ``` Classfile /Users/emccue/Development/micro-http-ring/Box.class Last modified May 14, 2022; size 325 bytes SHA-256 checksum 78f01c27cb6b16a51a1c0ac47bf1ceb94cc5a4a7672afc615eb634dd948ba138 Compiled from "Box.java" class Box minor version: 0 major version: 61 flags: (0x0020) ACC_SUPER this_class: #13 // Box super_class: #2 // java/lang/Object interfaces: 0, fields: 0, methods: 2, attributes: 1 Constant pool: #1 = Methodref #2.#3 // java/lang/Object."<init>":()V #2 = Class #4 // java/lang/Object #3 = NameAndType #5:#6 // "<init>":()V #4 = Utf8 java/lang/Object #5 = Utf8 <init> #6 = Utf8 ()V #7 = Methodref #8.#9 // java/lang/Integer.valueOf:(I)Ljava/lang/Integer; #8 = Class #10 // java/lang/Integer #9 = NameAndType #11:#12 // valueOf:(I)Ljava/lang/Integer; #10 = Utf8 java/lang/Integer #11 = Utf8 valueOf #12 = Utf8 (I)Ljava/lang/Integer; #13 = Class #14 // Box #14 = Utf8 Box #15 = Utf8 Code #16 = Utf8 LineNumberTable #17 = Utf8 f #18 = Utf8 ()Ljava/lang/Integer; #19 = Utf8 SourceFile #20 = Utf8 Box.java { Box(); descriptor: ()V flags: (0x0000) Code: stack=1, locals=1, args_size=1 0: aload_0 1: invokespecial #1 // Method 
java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  java.lang.Integer f();
    descriptor: ()Ljava/lang/Integer;
    flags: (0x0000)
    Code:
      stack=1, locals=2, args_size=1
         0: iconst_4
         1: invokestatic  #7                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
         4: astore_1
         5: aload_1
         6: areturn
      LineNumberTable:
        line 3: 0
        line 4: 5
}
SourceFile: "Box.java"
```

a bit verbose, but you see the `invokestatic` call in the last snippet refers to `Integer.valueOf`

> yeah

```
$ javap -c Box.class
Compiled from "Box.java"
class Box {
  Box();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  java.lang.Integer f();
    Code:
       0: iconst_4
       1: invokestatic  #7                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
       4: astore_1
       5: aload_1
       6: areturn
}
```

easier to see with `-c` (still learning javap)

> can there be case where the new Integer(i) is called

yes, if the integer is outside of the range -128 to 127 it will be outside of the cache and new Integer will be used. _for this implementation_

> if it can then Objects.equals() will return false , and it would behave like the byte[] example
>
> but i guess that never happens

not quite

the implementation of equals for Integer actually compares the value

```java
jshell> Integer i1 = Integer.valueOf(123456);
i1 ==> 123456

jshell> Integer i2 = Integer.valueOf(123456);
i2 ==> 123456

jshell> i1 == i2
$7 ==> false

jshell> i1.equals(i2)
$8 ==> true
```

```java
jshell> byte[] b1 = new byte[0];
b1 ==> byte[0] { }

jshell> byte[] b2 = new byte[0];
b2 ==> byte[0] { }

jshell> b1 == b2
$3 ==> false

jshell> b1.equals(b2)
$4 ==> false
```

reference equality `i1 == i2` will return false if you have distinct `Integer` objects

> <img src="/pages/5-14-22-docs.png" alt="Screenshot of docs page for Map.remove" ></img>

But the `equals` method for `Integer` inherited from `Object` is overridden so that comparing them with `.equals` or `java.util.Objects.equals` will give the answer you would usually expect.
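A minimal sketch of that difference inside a map, for anyone who wants to poke at it outside of jshell. The screenshots from the question are not reproduced here, so the `Map<String, byte[]>` shape, the use of the two-argument `remove(key, value)`, and the names `RemoveByValue`, `removeByteArray` and `removeInteger` are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class RemoveByValue {
    private RemoveByValue() {}

    // byte[] does not override equals, so a second array with the same
    // contents does not match the stored value and nothing is removed.
    static boolean removeByteArray() {
        Map<String, byte[]> map = new HashMap<>();
        map.put("k", new byte[] { 1, 2, 3 });
        return map.remove("k", new byte[] { 1, 2, 3 });
    }

    // Integer overrides equals to compare values, so the removal succeeds
    // even when the two Integer objects are distinct.
    static boolean removeInteger() {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", Integer.valueOf(123456));
        return map.remove("k", Integer.valueOf(123456));
    }

    public static void main(String[] args) {
        System.out.println(removeByteArray()); // false
        System.out.println(removeInteger());   // true
    }
}
```

If you do need content-based matching for `byte[]` values, you have to do the comparison yourself with something like `Arrays.equals`.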
Yep

so if you tried the `byte[]` map example with `Integer` then `.remove` would always find its target value and return `true` even if you used the deprecated `new Integer` directly or were outside of the cache range for `Integer.valueOf`.

```markdown
BYTE ARRAYS     | b1 == b2 | b1.equals(b2) | Objects.equals(b1, b2) | Arrays.equals(b1, b2)
----------------|--------------------------------------------------------------------------
SAME OBJECT     | true     | true          | true                   | true
----------------|--------------------------------------------------------------------------
SAME CONTENTS   | false    | false         | false                  | true
----------------|--------------------------------------------------------------------------
DIFF CONTENTS   | false    | false         | false                  | false
----------------|--------------------------------------------------------------------------
b1 is null      | false    | crash         | false                  | false
----------------|--------------------------------------------------------------------------
b2 is null      | false    | false         | false                  | false
----------------|--------------------------------------------------------------------------
both are null   | true     | crash         | true                   | true
```

> internal implementation does not use Objects.equals but instead does value.equals(provided)
>
> so yeah it overrides that method

```markdown
Integer         | i1 == i2 | i1.equals(i2) | Objects.equals(i1, i2) |
----------------|----------------------------------------------------
SAME OBJECT     | true     | true          | true                   |
----------------|----------------------------------------------------
SAME VALUE      | false    | true          | true                   |
----------------|----------------------------------------------------
DIFF VALUE      | false    | false         | false                  |
----------------|----------------------------------------------------
i1 is null      | false    | crash         | false                  |
----------------|----------------------------------------------------
i2 is null      | false    | false         | false                  |
----------------|----------------------------------------------------
both are null   | true     | crash         | true                   |
```

> hey thanks man, that was quite detailed

Sat, 14 May 2022 05:00:00 +0000

Go's Concurrency Examples in Java 19

https://mccue.dev/pages/5-2-22-go-concurrency-in-java

## Preface

Threads are usually expensive. There is no way for your operating system to know exactly how much stack space a thread will need so it allocates an amount on the order of a kilobyte initially and then around a megabyte once the thread starts to be used. You only have around a baker's dozen gigabytes of RAM, so you can only have give or take 10,000 active threads.

The way around this is to implement some mechanism that takes a limited number of operating system threads and juggles a much larger number of "logical threads" on top of them.

For most languages, this means adding some form of `async/await` syntax. Where you put an `await` the language knows it can switch to handling another task. You can only put `await`s inside of code marked `async`. This [has problems](https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/).

The Go programming language is different than most in that it implemented this juggling "non-cooperatively". You don't explicitly mark your code with `async` and `await`; the runtime slices it up for you. They call these cheap threads "goroutines."

The Java Virtual Machine is going to get an analogous feature called "Virtual Threads." This won't just benefit Java, but every language on the JVM including Clojure, Groovy, Kotlin, and Scala.

Virtual Threads are slated to appear as a "Preview" feature in Java 19 on September 20, 2022. This means that the implementation of the underlying feature is complete and tested, but the public API is subject to breaking changes and must be opted into explicitly.

Many of Go's patterns around concurrency arise from the conceit that you can create threads with abandon.
Since Java is about to join that club, it seems a good time to go through some of the Go concurrency examples and see what they might look like translated over. If you want to follow along, you can get an early access build [here](https://jdk.java.net/loom/). Unzip the files and add the `bin/` directory to your path. All the examples can be followed in sequence by using [`jshell`](https://docs.oracle.com/en/java/javase/18/docs/specs/man/jshell.html). ```bash $ java --version openjdk 19-loom 2022-09-20 OpenJDK Runtime Environment (build 19-loom+6-625) OpenJDK 64-Bit Server VM (build 19-loom+6-625, mixed mode, sharing) $ jshell --enable-preview --add-modules=jdk.incubator.concurrent ``` ## Example 1. Goroutines https://go.dev/tour/concurrency/1 ```go package main import ( "fmt" "time" ) func say(s string) { for i := 0; i < 5; i++ { time.Sleep(100 * time.Millisecond) fmt.Println(s) } } func main() { go say("world") say("hello") } ``` This is a pretty classic example, and frankly can be done with operating system threads just as well. ```java import java.time.Duration; import java.util.concurrent.Executors; public final class VirtualThreads { private VirtualThreads() {} static void say(String s) { try { for (int i = 0; i < 5; i++) { Thread.sleep(Duration.ofMillis(100)); System.out.println(s); } } catch (InterruptedException e) { throw new RuntimeException(e); } } public static void main(String[] args) { try (var executor = Executors.newVirtualThreadPerTaskExecutor()) { executor.submit(() -> say("world")); say("hello"); } } } ``` ```java VirtualThreads.main(new String[]{}); ``` A few key things to notice. 1. There is some noise in the `say` method around handling what will happen if the thread is interrupted. In this case we just choose to throw a [`RuntimeException`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/lang/RuntimeException.html) to indicate we just want to crash. 
In Go there is less noise, but there is also no way to interrupt Go's [`time.Sleep`](https://pkg.go.dev/time#Sleep).

It is also an option to propagate the [`InterruptedException`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/lang/InterruptedException.html) up if we add a `return null;` to target the [`Callable`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/Callable.html) overload.

```java
public final class VirtualThreads {
    private VirtualThreads() {}

    static void say(String s) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            Thread.sleep(Duration.ofMillis(100));
            System.out.println(s);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            executor.submit(() -> {
                say("world");
                return null;
            });

            say("hello");
        }
    }
}
```

2. You need more than `go say("world")`

`Executors.newVirtualThreadPerTaskExecutor()` creates an [`ExecutorService`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutorService.html). This is a thing which you can submit tasks to and it will run them "somehow".

Today most [`ExecutorService`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutorService.html)s are backed by some pool of threads. The purpose of the interface is to be able to write code without needing to know about the underlying strategy for maintaining that pool.

Virtual Threads are cheap, so you don't need to pool them. The interface still serves a use though. [`ExecutorService`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutorService.html)s will extend [`AutoCloseable`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/lang/AutoCloseable.html), so when used with the "try-with-resources" syntax you can make a block of code where you wait until all tasks have completed before moving on.
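For a rough picture of what that waiting amounts to, here is a sketch that spells it out by hand with `shutdown()` and `awaitTermination(...)`. It is my approximation of the behavior, not the JDK's actual `close()` code, and it uses a plain fixed thread pool so it runs without preview flags. `WaitForTasks` and `runAndWait` are made-up names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class WaitForTasks {
    private WaitForTasks() {}

    // Hypothetical helper: submits `tasks` small jobs, waits for all of
    // them, and returns how many had finished when the wait ended.
    static int runAndWait(int tasks) {
        AtomicInteger finished = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        throw new RuntimeException(e);
                    }
                    finished.incrementAndGet();
                });
            }
        } finally {
            // Stop accepting new work, then block until submitted work is done.
            executor.shutdown();
            try {
                while (!executor.awaitTermination(1, TimeUnit.SECONDS)) {
                    // Still running; keep waiting.
                }
            } catch (InterruptedException e) {
                executor.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndWait(10)); // prints 10
    }
}
```

The try-with-resources form gives you the same "nothing past this point runs until the tasks are done" guarantee without writing the `finally` block yourself.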
If you wanted to do the same creating threads directly it would look like this. ```java public final class VirtualThreads { private VirtualThreads() {} static void say(String s) { try { for (int i = 0; i < 5; i++) { Thread.sleep(Duration.ofMillis(100)); System.out.println(s); } } catch (InterruptedException e) { throw new RuntimeException(e); } } public static void main(String[] args) throws InterruptedException { var worldThread = Thread.startVirtualThread( () -> say("world") ); say("hello"); // Explicitly join to wait for the other thread. worldThread.join(); } } ``` ## Example 2. Channels https://go.dev/tour/concurrency/2 ```go package main import "fmt" func sum(s []int, c chan int) { sum := 0 for _, v := range s { sum += v } c <- sum // send sum to c } func main() { s := []int{7, 2, 8, -9, 4, 0} c := make(chan int) go sum(s[:len(s)/2], c) go sum(s[len(s)/2:], c) x, y := <-c, <-c // receive from c fmt.Println(x, y, x+y) } ``` Go has the concept of a "channel." This is a lightweight pipe along which values can be sent between ["Communicating Sequential Processes"](https://www.cs.cmu.edu/~crary/819-f09/Hoare78.pdf). Java does not have this concept in its standard library. There are [similar constructs](https://clojure.github.io/core.async/#clojure.core.async/chan) in libraries and it [may come in the future](https://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part2.html#channels), but for now no dice. A somewhat close analogue is a [`BlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/BlockingQueue.html), so that is what I am going to use for the purposes of these examples. 
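As a side note on how close the analogue is: an unbuffered Go channel hands each value directly from sender to receiver, and the nearest standard-library match for that rendezvous behavior is probably `SynchronousQueue`, which is itself a `BlockingQueue`. The sketch below is only an illustration. The class name `Rendezvous` and the method `handOff` are made up for this example, and it uses a plain platform thread so that it does not depend on a Loom build.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical demo class, not part of the original examples.
final class Rendezvous {
    private Rendezvous() {}

    // Sends one value from a producer thread to the caller and returns it.
    static int handOff(int value) {
        // A SynchronousQueue has no capacity: put() blocks until another
        // thread calls take(), much like a send on an unbuffered channel.
        BlockingQueue<Integer> channel = new SynchronousQueue<>();

        Thread producer = new Thread(() -> {
            try {
                channel.put(value);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        try {
            int received = channel.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handOff(42));
    }
}
```

The examples that follow use an `ArrayBlockingQueue` with a capacity of one, which behaves like a channel with a one-element buffer rather than a strictly unbuffered one.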
```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;

public final class Queues {
    private Queues() {}

    static void sum(
            int[] s,
            int start,
            int end,
            BlockingQueue<Integer> queue
    ) throws InterruptedException {
        int sum = 0;
        for (int i = start; i < end; i++) {
            sum += s[i];
        }
        queue.put(sum);
    }

    public static void main(String[] args)
            throws InterruptedException {
        int[] s = { 7, 2, 8, -9, 4, 0 };
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var queue = new ArrayBlockingQueue<Integer>(1);
            executor.submit(() -> {
                sum(s, 0, s.length / 2, queue);
                return null;
            });
            executor.submit(() -> {
                sum(s, s.length / 2, s.length, queue);
                return null;
            });

            int x = queue.take();
            int y = queue.take();

            System.out.printf("%d %d %d\n", x, y, x + y);
        }
    }
}
```

```java
Queues.main(new String[]{});
```

Instead of using Go's syntax for making slices of arrays, I opted to pass the indexes that each `sum` call was expected to work on.

It is only safe to share the memory for the array like this because no other threads are changing its contents. If there were, we would have summoned Gorslax. This would be true in both Go and Java.

In both cases the way this works is each worker sends the results of its computation to a logical queue. Once we have read two values off the shared queue we implicitly know that the two tasks we started have finished.

For "one shot" use cases such as this, you could also use Java's [`CompletableFuture`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/CompletableFuture.html) for the same purpose.
```java import java.util.concurrent.CompletableFuture; import java.util.concurrent.ExecutionException; import java.util.concurrent.Executors; public final class Queues { private Queues() {} static void sum( int[] s, int start, int end, CompletableFuture<Integer> future ) { int sum = 0; for (int i = start; i < end; i++) { sum += s[i]; } future.complete(sum); } public static void main(String[] args) throws InterruptedException, ExecutionException { int[] s = { 7, 2, 8, -9, 4, 0 }; try (var executor = Executors.newVirtualThreadPerTaskExecutor()) { var futureOne = new CompletableFuture<Integer>(); var futureTwo = new CompletableFuture<Integer>(); executor.submit(() -> { sum(s, 0, s.length / 2, futureOne); return null; }); executor.submit(() -> { sum(s, s.length / 2, s.length, futureTwo); return null; }); int x = futureOne.get(); int y = futureTwo.get(); System.out.printf("%d %d %d\n", x, y, x + y); } } } ``` This adds [`ExecutionException`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutionException.html) to the explicit list of things that can go wrong, but is a more direct api for a task that will run and produce one value as a result. In fact, if we were to change `sum` to return its result directly then we could eliminate its awareness that it is being run asynchronously. 
```java import java.util.concurrent.CompletableFuture; import java.util.concurrent.ExecutionException; import java.util.concurrent.Executors; public final class Queues { private Queues() {} static int sum(int[] s, int start, int end) { int sum = 0; for (int i = start; i < end; i++) { sum += s[i]; } return sum; } public static void main(String[] args) throws InterruptedException, ExecutionException { int[] s = { 7, 2, 8, -9, 4, 0 }; try (var executor = Executors.newVirtualThreadPerTaskExecutor()) { var futureOne = CompletableFuture .supplyAsync( () -> sum(s, 0, s.length / 2), executor ); var futureTwo = CompletableFuture .supplyAsync( () -> sum(s, s.length / 2, s.length), executor ); int x = futureOne.get(); int y = futureTwo.get(); System.out.printf("%d %d %d\n", x, y, x + y); } } } ``` And if we don't need any of the fancier capabilities of [`CompletableFuture`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/CompletableFuture.html), then the plain [`Future`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/Future.html) objects returned by submitting directly to the [`ExecutorService`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutorService.html) are also an option. 
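For reference, one of those fancier capabilities is composing results without blocking in between. Here is a minimal sketch using `thenCombine`. It repeats the `sum` helper so the block stands alone, the class name `CombinedSum` and the method `total` are hypothetical, and it runs on the default async pool rather than a virtual-thread executor so that it works without a Loom build.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class, not part of the original examples.
final class CombinedSum {
    private CombinedSum() {}

    static int sum(int[] s, int start, int end) {
        int sum = 0;
        for (int i = start; i < end; i++) {
            sum += s[i];
        }
        return sum;
    }

    static int total(int[] s) {
        var firstHalf = CompletableFuture.supplyAsync(
                () -> sum(s, 0, s.length / 2)
        );
        var secondHalf = CompletableFuture.supplyAsync(
                () -> sum(s, s.length / 2, s.length)
        );

        // thenCombine produces a third future that completes once both
        // halves are done; we only block once, on the combined result.
        return firstHalf.thenCombine(secondHalf, Integer::sum).join();
    }

    public static void main(String[] args) {
        System.out.println(total(new int[] { 7, 2, 8, -9, 4, 0 }));
    }
}
```

When nothing like that is needed, the plain `Future` version looks like this: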
```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;

public final class Queues {
    private Queues() {}

    static int sum(int[] s, int start, int end) {
        int sum = 0;
        for (int i = start; i < end; i++) {
            sum += s[i];
        }
        return sum;
    }

    public static void main(String[] args)
            throws InterruptedException, ExecutionException {
        int[] s = { 7, 2, 8, -9, 4, 0 };
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var futureOne = executor.submit(
                    () -> sum(s, 0, s.length / 2)
            );
            var futureTwo = executor.submit(
                    () -> sum(s, s.length / 2, s.length)
            );

            int x = futureOne.get();
            int y = futureTwo.get();

            System.out.printf("%d %d %d\n", x, y, x + y);
        }
    }
}
```

## Example 3. Buffered Channels

https://go.dev/tour/concurrency/3

```go
package main

import "fmt"

func main() {
    ch := make(chan int, 2)
    ch <- 1
    ch <- 2
    fmt.Println(<-ch)
    fmt.Println(<-ch)
}
```

There isn't much to this one. Go's channels can be "buffered", meaning they can accept multiple values before they will be "full". If a channel is full then any thread that wants to put a value onto that channel will have to wait until another thread takes a value off.

The [`ArrayBlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ArrayBlockingQueue.html) class we've been using works the same way.

```java
import java.util.concurrent.ArrayBlockingQueue;

public final class BufferedQueue {
    private BufferedQueue() {}

    public static void main(String[] args) throws InterruptedException {
        var queue = new ArrayBlockingQueue<Integer>(2);
        queue.put(1);
        queue.put(2);
        System.out.println(queue.take());
        System.out.println(queue.take());
    }
}
```

```java
BufferedQueue.main(new String[]{});
```

## Example 4. Range and Close

https://go.dev/tour/concurrency/4

```go
package main

import (
    "fmt"
)

func fibonacci(n int, c chan int) {
    x, y := 0, 1
    for i := 0; i < n; i++ {
        c <- x
        x, y = y, x+y
    }
    close(c)
}

func main() {
    c := make(chan int, 10)
    go fibonacci(cap(c), c)
    for i := range c {
        fmt.Println(i)
    }
}
```

Here is where the differences between a Java [`BlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/BlockingQueue.html) and a Go `chan` start to manifest themselves.

There is no ability to "close" a [`BlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/BlockingQueue.html). One way around this is to send a special "sentinel" value over the queue to indicate that a reader should stop reading. This only works cleanly when we have a single reader though.

There is also no equivalent to the `range` operator. We need to write a normal `while` loop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;

sealed interface TakeResult<T> {
    record GotValue<T>(T value) implements TakeResult<T> {}
    record NoValue<T>() implements TakeResult<T> {}
}

public final class Fibonacci {
    private Fibonacci() {}

    static void fibonacci(
            int n,
            BlockingQueue<TakeResult<Integer>> queue
    ) throws InterruptedException {
        int x = 0;
        int y = 1;

        for (int i = 0; i < n; i++) {
            queue.put(new TakeResult.GotValue<>(x));
            int temp = x;
            x = y;
            y = temp + x;
        }

        queue.put(new TakeResult.NoValue<>());
    }

    public static void main(String[] args) throws InterruptedException {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var queue = new ArrayBlockingQueue<TakeResult<Integer>>(10);
            executor.submit(() -> {
                fibonacci(queue.remainingCapacity(), queue);
                return null;
            });

            // The nested record has to be qualified with its enclosing interface.
            while (queue.take() instanceof TakeResult.GotValue<Integer> gotValue) {
                System.out.println(gotValue.value());
            }
        }
    }
}
```

```java
Fibonacci.main(new String[]{});
```

This snippet makes
use of sealed interfaces, a relatively recent addition to Java, for modeling getting either a legitimate value over the queue or a signal to stop consuming. The other options for the same result would be to drop the generic types from the [`BlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/BlockingQueue.html) and use a special sentinel instance of `Object` or disallow `null` values for normal use and have that indicate that the queue is closed. ## Example 5. Select https://go.dev/tour/concurrency/5 ```go package main import "fmt" func fibonacci(c, quit chan int) { x, y := 0, 1 for { select { case c <- x: x, y = y, x+y case <-quit: fmt.Println("quit") return } } } func main() { c := make(chan int) quit := make(chan int) go func() { for i := 0; i < 10; i++ { fmt.Println(<-c) } quit <- 0 }() fibonacci(c, quit) } ``` There is also no equivalent to `select` for [`BlockingQueue`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/BlockingQueue.html)s. We have to implement that logic in a hand rolled loop. 
```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;

public final class SelectQueues {
    private SelectQueues() {}

    static void fibonacci(BlockingQueue<Integer> queue, BlockingQueue<Integer> quit) {
        int x = 0;
        int y = 1;

        while (true) {
            if (queue.offer(x)) {
                int temp = x;
                x = y;
                y = temp + x;
            }

            if (quit.poll() != null) {
                System.out.println("quit");
                break;
            }
        }
    }

    public static void main(String[] args) {
        var queue = new ArrayBlockingQueue<Integer>(1);
        var quit = new ArrayBlockingQueue<Integer>(1);

        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            executor.submit(() -> {
                for (int i = 0; i < 10; i++) {
                    System.out.println(queue.take());
                }
                quit.put(0);
                return null;
            });

            fibonacci(queue, quit);
        }
    }
}
```

```java
SelectQueues.main(new String[]{});
```

I'm unsure for what purpose the Go version uses a channel of integers as its quit mechanism. In Java it is more natural to use something like a shared [`AtomicBoolean`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/atomic/AtomicBoolean.html) as a signal for shutdown.
```java import java.util.concurrent.ArrayBlockingQueue; import java.util.concurrent.BlockingQueue; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicBoolean; public final class SelectQueues { private SelectQueues() {} static void fibonacci(BlockingQueue<Integer> queue, AtomicBoolean quit) { int x = 0; int y = 1; while (!quit.get()) { if (queue.offer(x)) { int temp = x; x = y; y = temp + x; } } System.out.println("quit"); } public static void main(String[] args) throws InterruptedException { var queue = new ArrayBlockingQueue<Integer>(1); var quit = new AtomicBoolean(false); try (var executor = Executors.newVirtualThreadPerTaskExecutor()) { executor.submit(() -> { for (int i = 0; i < 10; i++) { System.out.println(queue.take()); } quit.set(true); return null; }); fibonacci(queue, quit); } } } ``` If it were a situation with multiple "one shot" queues then [`CompletableFuture#anyOf`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/CompletableFuture.html#anyOf(java.util.concurrent.CompletableFuture...)) and similar methods might suffice. ## Example 6. Default Selection https://go.dev/tour/concurrency/6 ```go package main import ( "fmt" "time" ) func main() { tick := time.Tick(100 * time.Millisecond) boom := time.After(500 * time.Millisecond) for { select { case <-tick: fmt.Println("tick.") case <-boom: fmt.Println("BOOM!") return default: fmt.Println(" .") time.Sleep(50 * time.Millisecond) } } } ``` There isn't a novel transformation of this default case syntax, but it is worth noting how Go's time library directly returns its channels as the mechanism for handling delays. 
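As far as I know, nothing in Java's standard library hands back a queue that fires after a delay, but a one-shot analogue of `time.After` can be sketched with `CompletableFuture.delayedExecutor` (available since Java 9). The class `Timers` and the helper `after` are hypothetical names introduced only for this illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of the original examples.
final class Timers {
    private Timers() {}

    // Returns a future that completes with the current time
    // once the given delay has passed.
    static CompletableFuture<Instant> after(Duration delay) {
        var delayed = CompletableFuture.delayedExecutor(
                delay.toMillis(), TimeUnit.MILLISECONDS
        );
        return CompletableFuture.supplyAsync(Instant::now, delayed);
    }

    public static void main(String[] args) throws InterruptedException {
        var boom = after(Duration.ofMillis(500));
        while (!boom.isDone()) {
            System.out.println("    .");
            Thread.sleep(50);
        }
        System.out.println("BOOM!");
    }
}
```

The translation that follows takes the more literal route of dedicated threads pushing timestamps onto queues, which also covers the repeating `tick` case.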
```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;

public final class GreenThreadDress {
    private GreenThreadDress() {}

    public static void main(String[] args) throws InterruptedException {
        var executor = Executors.newVirtualThreadPerTaskExecutor();
        try {
            var tick = new ArrayBlockingQueue<Instant>(1);
            var boom = new ArrayBlockingQueue<Instant>(1);

            executor.submit(() -> {
                while (true) {
                    Thread.sleep(Duration.ofMillis(100));
                    tick.put(Instant.now());
                }
            });

            executor.submit(() -> {
                Thread.sleep(500);
                boom.put(Instant.now());
                return null;
            });

            while (true) {
                if (tick.poll() != null) {
                    System.out.println("tick.");
                } else if (boom.poll() != null) {
                    System.out.println("BOOM!");
                    break;
                } else {
                    System.out.println("    .");
                    Thread.sleep(Duration.ofMillis(50));
                }
            }
        } finally {
            executor.shutdownNow();
            executor.close();
        }
    }
}
```

```java
GreenThreadDress.main(new String[]{});
```

Here we rely on the behavior of [`ExecutorService#shutdownNow`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ExecutorService.html#shutdownNow()) to interrupt the task pushing to the `tick` queue. This is unlike the built-in Go [`time.Tick`](https://pkg.go.dev/time#Tick), where the underlying goroutine is never cancelled and is a "leak."

## Example 7: Equivalent Binary Trees

https://go.dev/tour/concurrency/7

https://go.dev/tour/concurrency/8

```go
package main

import "golang.org/x/tour/tree"

// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk(t *tree.Tree, ch chan int)

// Same determines whether the trees
// t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool

func main() {
}
```

This one is a little bit different since it's not a straight example, but instead a challenge you are meant to complete.
A full solution can be found on [this StackOverflow question](https://stackoverflow.com/questions/12224042/go-tour-exercise-7-binary-trees-equivalence).

```go
package main

import "fmt"
import "golang.org/x/tour/tree"

// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk(t *tree.Tree, ch chan int) {
    var walker func(t *tree.Tree)
    walker = func (t *tree.Tree) {
        if (t == nil) {
            return
        }
        walker(t.Left)
        ch <- t.Value
        walker(t.Right)
    }
    walker(t)
    close(ch)
}

// Same determines whether the trees
// t1 and t2 contain the same values.
func Same(t1, t2 *tree.Tree) bool {
    ch1, ch2 := make(chan int), make(chan int)

    go Walk(t1, ch1)
    go Walk(t2, ch2)

    for {
        v1,ok1 := <- ch1
        v2,ok2 := <- ch2

        if v1 != v2 || ok1 != ok2 {
            return false
        }

        if !ok1 {
            break
        }
    }

    return true
}

func main() {
    fmt.Println("1 and 1 same: ", Same(tree.New(1), tree.New(1)))
    fmt.Println("1 and 2 same: ", Same(tree.New(1), tree.New(2)))
}
```

Where the `Tree` type is defined separately [here](https://cs.opensource.google/go/x/tour/+/master:tree/tree.go).

```go
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package tree // import "golang.org/x/tour/tree"

import (
    "fmt"
    "math/rand"
)

// A Tree is a binary tree with integer values.
type Tree struct {
    Left  *Tree
    Value int
    Right *Tree
}

// New returns a new, random binary tree holding the values k, 2k, ..., 10k.
func New(k int) *Tree {
    var t *Tree
    for _, v := range rand.Perm(10) {
        t = insert(t, (1+v)*k)
    }
    return t
}

func insert(t *Tree, v int) *Tree {
    if t == nil {
        return &Tree{nil, v, nil}
    }
    if v < t.Value {
        t.Left = insert(t.Left, v)
    } else {
        t.Right = insert(t.Right, v)
    }
    return t
}

func (t *Tree) String() string {
    if t == nil {
        return "()"
    }
    s := ""
    if t.Left != nil {
        s += t.Left.String() + " "
    }
    s += fmt.Sprint(t.Value)
    if t.Right != nil {
        s += " " + t.Right.String()
    }
    return "(" + s + ")"
}
```

So before touching the concurrency bits we need to translate this `Tree` type.

```java
import java.util.Collections;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public sealed interface Tree {
    Tree insert(int v);

    record NotEmpty(
            Tree left,
            int value,
            Tree right
    ) implements Tree {
        @Override
        public Tree insert(int v) {
            if (v < this.value) {
                return new NotEmpty(
                        this.left.insert(v),
                        this.value,
                        this.right
                );
            } else {
                return new NotEmpty(
                        this.left,
                        this.value,
                        this.right.insert(v)
                );
            }
        }

        @Override
        public String toString() {
            return "( " + this.left + this.value + this.right + " )";
        }
    }

    record Empty() implements Tree {
        @Override
        public Tree insert(int v) {
            return new NotEmpty(new Empty(), v, new Empty());
        }

        @Override
        public String toString() {
            return "";
        }
    }

    static Tree random(int k) {
        var vs = IntStream.range(0, 10)
                .boxed()
                .collect(Collectors.toList());
        Collections.shuffle(vs);

        Tree t = new Empty();
        for (int v : vs) {
            t = t.insert((1 + v) * k);
        }
        return t;
    }
}
```

A 1-1 translation of the Go wouldn't be fun Java, so I opted to translate it instead to an immutable sum type. This won't affect the concurrent part other than a stronger conceptual guarantee that we can safely share the tree across multiple threads.

With this version, the Go maps pretty straightforwardly to this.
```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;

sealed interface TakeResult<T> {
    record GotValue<T>(T value) implements TakeResult<T> {}
    record NoValue<T>() implements TakeResult<T> {}
}

public final class TreeWalker {
    private TreeWalker() {}

    private static void walkHelper(
            Tree tree,
            BlockingQueue<TakeResult<Integer>> queue
    ) throws InterruptedException {
        // Only non-empty nodes hold a value; an Empty tree has nothing to send.
        if (tree instanceof Tree.NotEmpty notEmpty) {
            walkHelper(notEmpty.left(), queue);
            queue.put(new TakeResult.GotValue<>(
                    notEmpty.value()
            ));
            walkHelper(notEmpty.right(), queue);
        }
    }

    static void walk(
            Tree tree,
            BlockingQueue<TakeResult<Integer>> queue
    ) throws InterruptedException {
        walkHelper(tree, queue);
        queue.put(new TakeResult.NoValue<>());
    }

    static boolean same(Tree t1, Tree t2) throws InterruptedException {
        var queue1 = new ArrayBlockingQueue<TakeResult<Integer>>(1);
        var queue2 = new ArrayBlockingQueue<TakeResult<Integer>>(1);

        var executor = Executors.newVirtualThreadPerTaskExecutor();
        try {
            executor.submit(() -> {
                walk(t1, queue1);
                return null;
            });
            executor.submit(() -> {
                walk(t2, queue2);
                return null;
            });

            while (true) {
                var result1 = queue1.take();
                var result2 = queue2.take();

                if (!result1.equals(result2)) {
                    return false;
                }

                if (result1 instanceof TakeResult.NoValue<Integer>) {
                    break;
                }
            }

            return true;
        } finally {
            executor.shutdownNow();
            executor.close();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(
                "1 and 1 same: " + same(Tree.random(1), Tree.random(1))
        );
        System.out.println(
                "1 and 2 same: " + same(Tree.random(1), Tree.random(2))
        );
    }
}
```

```java
TreeWalker.main(new String[]{});
```

We use the same tricks as before to emulate a closable queue with `TakeResult`. Then we translate the lock-step receive loop to a `while` loop calling `take` on each queue.

The example Go solution had a recursive local closure for walk.
While technically possible via some wizardry, it's more straightforward in Java to make a helper method.

There is also a reliance on the walk tasks responding correctly to `shutdownNow`. If they did not, `executor.close()` would hang and the scope wouldn't exit.

## Example 8: sync.Mutex

https://go.dev/tour/concurrency/9

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

// SafeCounter is safe to use concurrently.
type SafeCounter struct {
    mu sync.Mutex
    v  map[string]int
}

// Inc increments the counter for the given key.
func (c *SafeCounter) Inc(key string) {
    c.mu.Lock()
    // Lock so only one goroutine at a time can access the map c.v.
    c.v[key]++
    c.mu.Unlock()
}

// Value returns the current value of the counter for the given key.
func (c *SafeCounter) Value(key string) int {
    c.mu.Lock()
    // Lock so only one goroutine at a time can access the map c.v.
    defer c.mu.Unlock()
    return c.v[key]
}

func main() {
    c := SafeCounter{v: make(map[string]int)}
    for i := 0; i < 1000; i++ {
        go c.Inc("somekey")
    }

    time.Sleep(time.Second)
    fmt.Println(c.Value("somekey"))
}
```

Java has a direct analogue to [`sync.Mutex`](https://pkg.go.dev/sync#Mutex) in [`ReentrantLock`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/locks/ReentrantLock.html). We can make this same program without much issue.
```java import java.time.Duration; import java.util.HashMap; import java.util.Map; import java.util.concurrent.locks.ReentrantLock; final class SafeCounter { private final Map<String, Integer> v; private final ReentrantLock lock; public SafeCounter() { this.v = new HashMap<>(); this.lock = new ReentrantLock(); } void inc(String key) { lock.lock(); try { v.put(key, v.getOrDefault(key, 0) + 1); } finally { lock.unlock(); } } int value(String key) { lock.lock(); try { return v.getOrDefault(key, 0); } finally { lock.unlock(); } } } public final class Mutex { private Mutex() {} public static void main(String[] args) throws InterruptedException { var c = new SafeCounter(); for (int i = 0; i < 1000; i++) { Thread.startVirtualThread( () -> c.inc("somekey") ); } Thread.sleep(Duration.ofSeconds(1)); System.out.println(c.value("somekey")); } } ``` ```java Mutex.main(new String[]{}); ``` The only thing of note is that unlike Go where you can `defer` some arbitrary action like releasing your hold on a lock, in Java the general mechanism for "cleanup that must happen" is using the `finally` clause of a `try` block. For Java [`ReentrantLock`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/locks/ReentrantLock.html) is special in that its locks can "escape" a lexical scope. You can lock before entering a method and unlock in a totally unrelated one. If you don't need this ability then you can use the fact that every "identity" having object in Java can be used as a lock with `synchronized`. 
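To make the "escaping a lexical scope" point concrete, here is a small sketch in which the lock is taken in one method and released in another. The `Ledger` class and its `begin`, `deposit`, `commit`, and `demo` methods are invented for this illustration, and plain platform threads are used so that it runs without a Loom build.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the original examples.
final class Ledger {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    // Acquires the lock and returns while still holding it.
    void begin() {
        lock.lock();
    }

    // Only legal between begin() and commit() on the same thread.
    void deposit(int amount) {
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalStateException("call begin() first");
        }
        balance += amount;
    }

    // Releases the lock that begin() acquired.
    void commit() {
        lock.unlock();
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Runs one begin/deposit/commit cycle on each of `threads` threads.
    static int demo(int threads) {
        var ledger = new Ledger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            var worker = new Thread(() -> {
                ledger.begin();
                try {
                    ledger.deposit(1);
                } finally {
                    ledger.commit();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (var worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return ledger.balance();
    }

    public static void main(String[] args) {
        System.out.println(demo(100));
    }
}
```

With `synchronized`, by contrast, acquiring and releasing are always tied to a single block: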
```java final class SafeCounter { private final Map<String, Integer> v; public SafeCounter() { this.v = new HashMap<>(); } void inc(String key) { synchronized (this) { v.put(key, v.getOrDefault(key, 0) + 1); } } int value(String key) { synchronized (this) { return v.getOrDefault(key, 0); } } } ``` If the thing being synchronized on is just `this` and that synchronization lasts the entire scope of the method, we can just mark the method as `synchronized` for the same effect. ```java final class SafeCounter { private final Map<String, Integer> v; public SafeCounter() { this.v = new HashMap<>(); } synchronized void inc(String key) { v.put(key, v.getOrDefault(key, 0) + 1); } synchronized int value(String key) { return v.getOrDefault(key, 0); } } ``` ## Example 9: Web Crawler https://go.dev/tour/concurrency/10 ```go package main import ( "fmt" ) type Fetcher interface { // Fetch returns the body of URL and // a slice of URLs found on that page. Fetch(url string) (body string, urls []string, err error) } // Crawl uses fetcher to recursively crawl // pages starting with url, to a maximum of depth. func Crawl(url string, depth int, fetcher Fetcher) { // TODO: Fetch URLs in parallel. // TODO: Don't fetch the same URL twice. // This implementation doesn't do either: if depth <= 0 { return } body, urls, err := fetcher.Fetch(url) if err != nil { fmt.Println(err) return } fmt.Printf("found: %s %q\n", url, body) for _, u := range urls { Crawl(u, depth-1, fetcher) } return } func main() { Crawl("https://golang.org/", 4, fetcher) } // fakeFetcher is Fetcher that returns canned results. type fakeFetcher map[string]*fakeResult type fakeResult struct { body string urls []string } func (f fakeFetcher) Fetch(url string) (string, []string, error) { if res, ok := f[url]; ok { return res.body, res.urls, nil } return "", nil, fmt.Errorf("not found: %s", url) } // fetcher is a populated fakeFetcher. 
var fetcher = fakeFetcher{ "https://golang.org/": &fakeResult{ "The Go Programming Language", []string{ "https://golang.org/pkg/", "https://golang.org/cmd/", }, }, "https://golang.org/pkg/": &fakeResult{ "Packages", []string{ "https://golang.org/", "https://golang.org/cmd/", "https://golang.org/pkg/fmt/", "https://golang.org/pkg/os/", }, }, "https://golang.org/pkg/fmt/": &fakeResult{ "Package fmt", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, "https://golang.org/pkg/os/": &fakeResult{ "Package os", []string{ "https://golang.org/", "https://golang.org/pkg/", }, }, } ``` Another challenge problem. Before going to a reference solution, I am going to translate the synchronous example. Go has a pattern of returning an error as a conditionally filled in extra return value. This isn't super idiomatic in Java, so instead I am going to model the error case as an `Exception`. The Go version also returns both a body and an array of urls from a single call. For Java we can accomplish this by making an aggregate containing both values. ```java final class FetcherException extends Exception { FetcherException(String message) { super(message); } } interface Fetcher { record Result(String body, String[] urls) { } Result fetch(String url) throws FetcherException; } ``` We can make a "fake" implementation of fetcher using the same technique as Go by backing it with an in-memory map. 
```java
import java.util.Map;

final class FakeFetcher implements Fetcher {
    private final Map<String, Result> results;

    public FakeFetcher(Map<String, Result> results) {
        this.results = results;
    }

    @Override
    public Result fetch(String url) throws FetcherException {
        var result = this.results.get(url);
        if (result == null) {
            throw new FetcherException("Not Found: " + url);
        } else {
            return result;
        }
    }

    public static Fetcher example() {
        return new FakeFetcher(Map.of(
                "https://golang.org/",
                new Fetcher.Result(
                        "The Go Programming Language",
                        new String[]{
                                "https://golang.org/pkg/",
                                "https://golang.org/cmd/"
                        }
                ),
                "https://golang.org/pkg/",
                new Fetcher.Result(
                        "Packages",
                        new String[]{
                                "https://golang.org/",
                                "https://golang.org/cmd/",
                                "https://golang.org/pkg/fmt/",
                                "https://golang.org/pkg/os/",
                        }
                ),
                "https://golang.org/pkg/fmt/",
                new Fetcher.Result(
                        "Package fmt",
                        new String[]{
                                "https://golang.org/",
                                "https://golang.org/pkg/",
                        }
                ),
                "https://golang.org/pkg/os/",
                new Fetcher.Result(
                        "Package os",
                        new String[]{
                                "https://golang.org/",
                                "https://golang.org/pkg/",
                        }
                )
        ));
    }
}
```

Then the synchronous fetcher is just a regular recursive function as it was in Go.

```java
import java.util.Arrays;

public final class WebCrawler {
    private WebCrawler() {
    }

    static void crawl(
            String url,
            int depth,
            Fetcher fetcher
    ) {
        if (depth <= 0) {
            return;
        }

        Fetcher.Result result;
        try {
            result = fetcher.fetch(url);
        } catch (FetcherException e) {
            System.out.println(e.getMessage());
            return;
        }

        var body = result.body();
        var urls = result.urls();

        System.out.printf(
                "Found: %s %s\n",
                body,
                Arrays.toString(urls)
        );

        for (var u : urls) {
            crawl(u, depth - 1, fetcher);
        }
    }

    public static void main(String[] args) {
        var fetcher = FakeFetcher.example();
        crawl("https://golang.org/", 4, fetcher);
    }
}
```

```java
WebCrawler.main(new String[]{});
```

Now I am going to pull an answer to the exercise from [this StackOverflow post](https://stackoverflow.com/questions/13217547/tour-of-go-exercise-10-crawler).
```go // SafeUrlMap is safe to use concurrently. type SafeUrlMap struct { v map[string]string mux sync.Mutex } func (c *SafeUrlMap) Set(key string, body string) { c.mux.Lock() // Lock so only one goroutine at a time can access the map c.v. c.v[key] = body c.mux.Unlock() } // Value returns mapped value for the given key. func (c *SafeUrlMap) Value(key string) (string, bool) { c.mux.Lock() // Lock so only one goroutine at a time can access the map c.v. defer c.mux.Unlock() val, ok := c.v[key] return val, ok } // Crawl uses fetcher to recursively crawl // pages starting with url, to a maximum of depth. func Crawl(url string, depth int, fetcher Fetcher, urlMap SafeUrlMap) { defer wg.Done() urlMap.Set(url, body) if depth <= 0 { return } body, urls, err := fetcher.Fetch(url) if err != nil { fmt.Println(err) return } for _, u := range urls { if _, ok := urlMap.Value(u); !ok { wg.Add(1) go Crawl(u, depth-1, fetcher, urlMap) } } return } var wg sync.WaitGroup func main() { urlMap := SafeUrlMap{v: make(map[string]string)} wg.Add(1) go Crawl("http://golang.org/", 4, fetcher, urlMap) wg.Wait() for url := range urlMap.v { body, _ := urlMap.Value(url) fmt.Printf("found: %s %q\n", url, body) } } ``` This solution makes use of a [`sync.WaitGroup`](https://pkg.go.dev/sync#WaitGroup). There is no direct analogue in Java, but we can pretty easily make something that has similar semantics. I am taking the implementation [from this StackOverflow question](https://stackoverflow.com/questions/29655531/what-is-the-java-equivalent-of-golangs-waitgroup). 
```java
final class WaitGroup {
    private int jobs = 0;

    public synchronized void add(int i) {
        jobs += i;
    }

    public synchronized void done() {
        if (--jobs == 0) {
            notifyAll();
        }
    }

    public synchronized void await() throws InterruptedException {
        while (jobs > 0) {
            wait();
        }
    }
}
```

The `SafeUrlMap` type is also fairly trivial to assemble, but instead of that I am going to pass around a normal [`HashSet`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/HashSet.html) and manually synchronize on it. There are many other options, including using [`Collections#synchronizedSet`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/Collections.html#synchronizedSet(java.util.Set)) or wrapping a [`ConcurrentHashMap`](https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/concurrent/ConcurrentHashMap.html), but I think this will be the easiest to follow.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class WebCrawler {
    private WebCrawler() {
    }

    static void crawlTask(
            String url,
            int depth,
            Fetcher fetcher,
            ExecutorService executor,
            Set<String> seen,
            WaitGroup waitGroup
    ) {
        try {
            if (depth <= 0) {
                return;
            }

            Fetcher.Result result;
            try {
                result = fetcher.fetch(url);
            } catch (FetcherException e) {
                System.out.println(e.getMessage());
                return;
            }

            var body = result.body();
            var urls = result.urls();

            System.out.printf(
                    "Found: %s %s\n",
                    body,
                    Arrays.toString(urls)
            );

            for (var u : urls) {
                synchronized (seen) {
                    if (!seen.contains(u)) {
                        seen.add(u);
                        waitGroup.add(1);
                        executor.submit(() -> crawlTask(
                                u,
                                depth - 1,
                                fetcher,
                                executor,
                                seen,
                                waitGroup
                        ));
                    }
                }
            }
        } finally {
            waitGroup.done();
        }
    }

    static void crawl(String url, int depth, Fetcher fetcher)
            throws InterruptedException {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var waitGroup = new WaitGroup();
            waitGroup.add(1);
            executor.submit(() -> crawlTask(
                    url,
                    depth,
                    fetcher,
                    executor,
                    new HashSet<>(),
                    waitGroup
            ));
            waitGroup.await();
        }
    }

    public static void main(String[] args) throws
InterruptedException { var fetcher = FakeFetcher.example(); crawl("https://golang.org/", 4, fetcher); } } ``` ```java WebCrawler.main(new String[]{}); ``` We could stop there, but there is one API that is in an incubator module in the early access builds that can replace our home-rolled `WaitGroup`. This is why I had `--add-modules=jdk.incubator.concurrent` in the jshell command up top. A `StructuredTaskScope.ShutdownOnFailure` lets us submit an arbitrary number of tasks to the scope recursively and will only close after all those tasks are complete. There is another implementation `StructuredTaskScope.ShutdownOnSuccess` that will finish after a single task succeeds. This obviates the need to manually count up and down with a `WaitGroup`. ```java import java.util.Arrays; import java.util.HashSet; import java.util.Set; import jdk.incubator.concurrent.StructuredTaskScope; public final class WebCrawler { private WebCrawler() { } static void crawlTask( String url, int depth, Fetcher fetcher, StructuredTaskScope.ShutdownOnFailure structuredTaskScope, Set<String> seen ) { if (depth <= 0) { return; } Fetcher.Result result; try { result = fetcher.fetch(url); } catch (FetcherException e) { System.out.println(e.getMessage()); return; } var body = result.body(); var urls = result.urls(); System.out.printf( "Found: %s %s\n", body, Arrays.toString(urls) ); for (var u : urls) { synchronized (seen) { if (!seen.contains(u)) { seen.add(u); structuredTaskScope.fork(() -> { crawlTask( u, depth - 1, fetcher, structuredTaskScope, seen ); return null; }); } } } } static void crawl(String url, int depth, Fetcher fetcher) throws InterruptedException { try (var structuredTaskScope = new StructuredTaskScope.ShutdownOnFailure()) { structuredTaskScope.fork(() -> { crawlTask( url, depth, fetcher, structuredTaskScope, new HashSet<>() ); return null; }); structuredTaskScope.join(); } } public static void main(String[] args) throws InterruptedException { var fetcher =
FakeFetcher.example(); crawl("https://golang.org/", 4, fetcher); } } ``` ```java WebCrawler.main(new String[]{}); ``` This combines the role of the `ExecutorService` and the `WaitGroup` into one object which at the least makes for one less point of coordination and slightly cleaner code. The exact shape the `StructuredTaskScope` API will take is very much in flux, so if you are reading this a year or so in the future this snippet might not work. ## Wrapping up Hopefully this was informative. If not, that's fine too. Leave questions, comments, and suggestions in the comments. Mon, 02 May 0022 05:00:00 +0000Java Serialization is Funhttps://mccue.dev/pages/4-28-22-serialization ## What is serialization Serialization is the process of taking an in-memory representation of data and transforming it to a representation suitable for sending to another location. <img src="/pages/3-15-22-serialization-diagram.svg" alt="Diagram showing the serialization flow" ></img> Deserialization is the reverse of that process. Code takes a structured representation of data from some location and transforms it to a representation in-memory. <img src="/pages/3-15-22-deserialization-diagram.svg" alt="Diagram showing the deserialization flow"></img> Every programming language has a myriad of approaches for performing these tasks. These approaches vary greatly depending on the semantics of the language, the semantics of the output format, and the culture surrounding both. What sets Java's serialization mechanism apart is that the semantics of the language map extremely closely to that of the output format. To fully appreciate the implications of this, allow me to take you on a bit of a tour of some other data formats. ## CSV CSV, Comma Separated Values, is one of the most "basic" data formats out there. Data is written one line at a time, with each value in a "row" separated by commas.
```csv frankie,25,yes,Jun 8, 2023 casca,63,no,none ``` By convention sometimes the very first row is used to store a "label" of what each "column" means. ```csv First Name,Number of Cats,Tax Fraud?,Upcoming Court Date frankie,25,yes,Jun 8, 2023 casca,63,no,none ``` While labels can add contextual information, the actual "data model" that is directly encoded here is just rows of strings. Interpretation of these rows is dependent on a combination of convention and "out of band" information. ``` CSV is - a list of Rows A Row is - a list of strings ``` CSV is popular in quite a few domains. It's easy to import and export to Spreadsheets, write out from sensors on an Arduino, and feed into Machine Learning libraries. But its data model is not close to how most programs represent data. To go from a representation in memory to CSV is almost always going to be a "lossy" process. To go from CSV back to that same representation in memory requires knowledge about how to interpret the order of elements in a row, what each element means, etc. ```java import java.time.LocalDate; record Person( // have to assume that the first element is the name String name, // have to assume that the second element is this int numberOfCats, // How should a boolean be encoded? boolean taxFraud, // What format is the date in? // What is done when no value is known? LocalDate upcomingCourtDate ) { static Person fromCsvRow(List<String> row) { // Code here could be autogenerated if you assume // conventions, but it probably won't be if (row.size() != 4) { ... } String name = row.get(0); int numberOfCats; try { numberOfCats = Integer.parseInt(row.get(1)); } catch (NumberFormatException __) { ... } // ... and so on ... return new Person(name, numberOfCats, ...); } List<String> toCsvRow() { // ... return List.of(this.name, ...); } } ``` ## JSON "JavaScript Object Notation" is a format derived from the syntax of declaring object literals in JavaScript.
```json { "stockName": "IDK", "stockPrice": "100USD", "twitterComments": [ { "retweets": 10, "text": "..." }, { "retweets": 20, "text": "..." } ] } ``` Compared to CSV it is way more expressive. Instead of just rows of strings the data model includes dedicated representations for booleans, numbers, lists, and more. ``` JSON is one of - null - a string - a number - a boolean - a list of JSON - a map of string to JSON ``` This makes it somewhat of a "lowest common denominator" data format. Most modern languages have support for these data types and the structure can represent nested data much more ergonomically than "flat" formats like CSV. The translation from a model in memory to JSON is still "lossy" in quite a few common cases though. ```java record Recruiter( // Often enums will be translated to Strings TellsYouTheSalary tellsYouTheSalary, // Times might be put into an ISO-8601 format String // or a Unix Time integer Instant postedFirstCringeStatus, // Sets aren't representable, so often // they will be encoded as lists Set<ReservationsAtDorsier> reservations, // Multiple possibilities with overlapping fields need a // convention for representing which is present LovedOne lovedOne ) {} enum TellsYouTheSalary { UP_FRONT, IF_YOU_ASK, NEVER } sealed interface LovedOne {} record Cat(String name) implements LovedOne {} record Dog(String name) implements LovedOne {} record NoOne() implements LovedOne {} // Both of these would be valid representations // depending on your conventions // // { "tellsYouTheSalary": "UP_FRONT", // "postedFirstCringeStatus": 1234, // "reservations": [], // "lovedOne": {"type": "cat", "name": "fred" } } // // { "tells_you_the_salary": "up_front", // "posted_first_cringe_status": "2020-07-10 15:00:00.000", // "reservations": {"kind": "set", "contents": []}, // "loved_one": {"kind": "cat", "name": "fred"} } ``` ## EDN "Extensible Data Notation" is a format that came out of the syntax of the Clojure programming language.
```clojure { :teethLeft #{5 12 14 23} :countryOfOrigin "United States of America" :whelped #inst "2006-04-12T00:00:00.000-00:00" :parents #{#pokemon "Skitty" #pokemon "Wailord"} :moves [:quick-attack :tail-whip] } ``` More likely than not you have not heard of it. That's a shame because it's pretty cool. Compared to JSON it has a larger base set of types and a defined mechanism for extending that set. ``` EDN is one of - null - a string - an integer - a vector of EDN - a map of EDN to EDN - a set of EDN - a keyword - a symbol - an element with a tag and an EDN value ... and a few other base types ... ``` The key capability for the purposes of this discussion is that you are able to attach an arbitrary tag to any EDN value. This serves the same purpose as the `{ "type": ..., "data": ... }` pattern in JSON, but by virtue of being part of the format that encoding is not "positional". As an example of what I mean, in JSON the way you know that a given field contains a moment in time is by knowing implicitly that the string under a specific name like "createdAt" will be formatted as a timestamp. ```json { "createdAt": "2020-08-12T00:00:00.000-00:00" } ``` In EDN if you know how a given tag like `#inst` should be interpreted then you can automatically do that interpretation no matter where in the structure of the document it appears. ```clojure { "createdAt" #inst"2020-08-12T00:00:00.000-00:00" } ``` This means that translation to and from EDN doesn't have to be lossy in the same way JSON serialization is. If you have a custom aggregate, you can define a tag for that aggregate and include whatever data is needed to reconstruct it. ```java package some.pack; sealed interface Mascot {} record Gecko(int age) implements Mascot {} record Sailor(int age, boolean captain) implements Mascot {} // This could be encoded as // #some.pack.Gecko{:age 12} // #some.pack.Sailor{:age 35 :captain true} ``` You can also have non-string keys `{{:map "key"} "whatever value"}`. Y'all are missing out.
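To make the "not positional" point about tags a little more concrete, here is a minimal sketch in Java. It assumes the EDN text has already been parsed into plain maps, lists, and a made-up `Tagged` record; `Tagged`, `TagResolver`, and `TagResolverDemo` are hypothetical names for illustration and do not come from any real EDN library. The idea is only that one handler per tag is enough, because the walk finds tagged values wherever they sit in the structure.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a parsed tagged element such as
// #inst "2020-08-12T00:00:00Z". Not part of any real EDN library.
record Tagged(String tag, Object value) {}

final class TagResolver {
    private final Map<String, Function<Object, Object>> handlers;

    TagResolver(Map<String, Function<Object, Object>> handlers) {
        this.handlers = Map.copyOf(handlers);
    }

    // Recursively walks maps and lists. Any Tagged value with a known
    // tag is passed to its handler; unknown tags are kept as Tagged.
    Object resolve(Object data) {
        if (data instanceof Tagged tagged) {
            Object inner = resolve(tagged.value());
            Function<Object, Object> handler = handlers.get(tagged.tag());
            return handler == null
                ? new Tagged(tagged.tag(), inner)
                : handler.apply(inner);
        }
        if (data instanceof Map<?, ?> map) {
            Map<Object, Object> out = new LinkedHashMap<>();
            map.forEach((k, v) -> out.put(resolve(k), resolve(v)));
            return out;
        }
        if (data instanceof List<?> list) {
            return list.stream().map(this::resolve).toList();
        }
        return data;
    }
}

final class TagResolverDemo {
    public static void main(String[] args) {
        // One handler for "inst", assumed here to carry an ISO-8601 string.
        var resolver = new TagResolver(
            Map.of("inst", v -> Instant.parse((String) v)));

        // The same tag appears at the top level and nested inside a list.
        Map<String, Object> document = Map.of(
            "createdAt", new Tagged("inst", "2020-08-12T00:00:00Z"),
            "edits", List.of(new Tagged("inst", "2021-01-01T00:00:00Z")));

        System.out.println(resolver.resolve(document));
    }
}
```

Nothing in `resolve` knows the name "createdAt". With JSON, the equivalent code would need a list of field names (or some other convention) to know which strings are timestamps.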
## Java's Serialization Format "Java Serialization" is a mechanism by which any object in memory can be serialized to and deserialized from a sequence of bytes while preserving the same semantics that object had in memory. For regular classes, it accomplishes this by recursively scraping the fields of the class and producing bytes as specified [here](https://docs.oracle.com/en/java/javase/18/docs/specs/serialization/protocol.html). Then when the bytes are read back in, it reconstructs the object by doing the reverse. For "special" classes (Strings, Enums, and Records) there are slightly different rules, but the effect is essentially the same. This is exceedingly hard to properly communicate with words, so here is a quick walk-through. You can follow along by pasting each snippet into [JShell](https://tryjshell.org/). (If you have Java installed, run `jshell` on the command line) ### Step 1. Make a Serializable class Implement the `Serializable` marker interface and make sure every field of your class does as well or is a primitive. ```java import java.io.Serializable; public class LabeledPosition implements Serializable { private String label; private int x; private int y; public LabeledPosition(String label, int x, int y) { this.label = label; this.x = x; this.y = y; } @Override public String toString() { return "LabeledPosition[label=" + this.label + ", x=" + this.x + ", y=" + this.y + "]"; } } ``` ### Step 2. Make an ObjectOutputStream You can make this special class by wrapping any existing `OutputStream`. This is where the bytes of your serialized form will be written. ```java import java.io.ByteArrayOutputStream; import java.io.ObjectOutputStream; var byteArrayOutputStream = new ByteArrayOutputStream(); var objectOutputStream = new ObjectOutputStream( byteArrayOutputStream ); ``` ### Step 3. 
Write your object to the ObjectOutputStream This is a binary format, so there isn't any fun visual aid, but you can inspect and see that indeed we have written some bytes. ```java objectOutputStream.writeObject(new LabeledPosition("bob", 9, 1)); byte[] bytes = byteArrayOutputStream.toByteArray(); System.out.println(Arrays.toString(bytes)); // [-84, -19, 0, ..., 98, 111, 98] ``` ### Step 4. Create an ObjectInputStream This is very similar to how we wrote the object out. Wrap any existing `InputStream`. ```java import java.io.ByteArrayInputStream; import java.io.ObjectInputStream; var byteArrayInputStream = new ByteArrayInputStream(bytes); var objectInputStream = new ObjectInputStream(byteArrayInputStream); ``` ### Step 5. Read in the object you wrote out ```java var labeledPosition = (LabeledPosition) objectInputStream.readObject(); System.out.println(labeledPosition); // LabeledPosition[label=bob, x=9, y=1] ``` ### Step 6. Make another Serializable class Hold with me here, this gets good. ```java record TwoLists( List<Integer> listOne, List<Integer> listTwo ) implements Serializable {} ``` ### Step 7. Make a mutable object So here we will make an instance of this TwoLists record where each List is the exact same list in memory. This means that if we add to either `listOne` or `listTwo` both will be updated. ```java var theList = new ArrayList<>(List.of(1, 2, 3)); var twoLists = new TwoLists(theList, theList); System.out.println(twoLists); // TwoLists[listOne=[1, 2, 3], listTwo=[1, 2, 3]] twoLists.listOne().add(4); System.out.println(twoLists); // TwoLists[listOne=[1, 2, 3, 4], listTwo=[1, 2, 3, 4]] ``` ### Step 8. Write that mutable object to an ObjectOutputStream ```java var byteArrayOutputStream = new ByteArrayOutputStream(); var objectOutputStream = new ObjectOutputStream( byteArrayOutputStream ); objectOutputStream.writeObject(twoLists); byte[] bytes = byteArrayOutputStream.toByteArray(); ``` ### Step 9. 
Read that mutable object from an ObjectInputStream ```java var byteArrayInputStream = new ByteArrayInputStream(bytes); var objectInputStream = new ObjectInputStream(byteArrayInputStream); var roundTripped = (TwoLists) objectInputStream.readObject(); ``` ### Step 10. Oh no Oh yeah. ```java System.out.println(roundTripped); // TwoLists[listOne=[1, 2, 3, 4], listTwo=[1, 2, 3, 4]] System.out.println(roundTripped.listOne() == roundTripped.listTwo()); // true roundTripped.listOne().add(5); System.out.println(roundTripped); // TwoLists[listOne=[1, 2, 3, 4, 5], listTwo=[1, 2, 3, 4, 5]] ``` If you have the same object in two places in the "object graph" of something you are serializing, the fact that those two places hold the same object is preserved. Because of this, you can even seamlessly serialize things like circular linked lists. ```java class CircularThing implements Serializable { CircularThing next; } // How would you write this in JSON? var circular = new CircularThing(); circular.next = circular; ``` ## What is this good for? * Prototypes and Projects on a tight deadline. Since you can save any arbitrary object and there is no extra code needed to make that just "work", Java Serialization can be a very useful crutch for getting code working quickly. * Saving arbitrary objects that were very expensive to create In the Python world, [a similar utility](https://docs.python.org/3/library/pickle.html) is often used to save the results of training ML models. It's easy to imagine that Java Serialization could see similar use if Data Science ever took off on the JVM in the same way. * Dynamically sending code to another machine [Spark](https://spark.apache.org/docs/latest/tuning.html#data-serialization) uses this mechanism for distributing Java objects across different nodes. ## What is this bad for? * Models that will change over time While you can version serialized objects, doing so is non-obvious and error-prone.
Making a class serializable, especially in a library, can therefore be a fairly large maintenance problem. * Applications where an untrusted entity might be able to send data to your application If you read serialized data that you did not write, that is a giant security hole. There is more nuance to it, but basically if you read untrusted serialized data then any hacker can get full access to your system. I'm not going to go into every way you can exploit serialization, but [this talk](https://www.youtube.com/watch?v=dOgfWXw9VrI) should give you a basic idea. This was a crucial part of the [Log4Shell](https://mogwailabs.de/en/blog/2021/12/vulnerability-notes-log4shell/) vulnerability. * Things that should be edited by humans like configuration files Because serialized objects are stored in a binary format, they are impossible to read without special tooling and prohibitively hard to write by hand. * Sending data outside the Java/JVM world While technically you could write a parser for the binary format in your language of choice and recover the information, you would likely be the first. If you need to share values with programs in other languages, falling back to a "lowest common denominator" like JSON is a better strategy. --- Part of what made writing this so hard for me is that most people who I've seen be shown serialization were shown it very early in their curriculums. It's hard to explain nuance around the object model and encapsulation when talking to someone who learned what classes are two weeks back, so I left most of that out. Leave a comment below if anything was unclear, incorrect, or you would like to learn more.Thu, 28 Apr 0022 05:00:00 +0000Java's options for optionshttps://mccue.dev/pages/2-8-22-options-for-options # Hypothetical 1 Imagine that you made a bit of code that outputs JSON. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json ) { ...
} } ``` By default your output contains no extra whitespace, but you want to provide an option to the user to print that JSON with some indentation. ### Without indentation ```json [{"name":"joe","age":35}] ``` ### With indentation ```json [ { "name":"joe", "age":35 } ] ``` ## Option 1. Don't support it Any toggles you add to your api are toggles you might need to support now and forever more. Depending on the code you are writing and who its consumers are, it might make more sense to provide a more restricted api. ## Option 2. Make another method With only a single option you want configurable, you can just expose a method with a different name. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json ) { ... } public static void writeJsonWithIndentation( Appendable out, Json json ) { ... } } ``` ```java writeJson(out, json); writeJsonWithIndentation(out, json); ``` ## Option 3. Add a boolean argument A single option is either on or off. True or false. That is often the domain of a boolean. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, boolean indent ) { if (indent) { ... } else { ... } } } ``` ```java writeJson(out, json, true); writeJson(out, json, false); ``` ## Option 4. Add an enum argument Booleans are great, but for understandability at the call-site you might want to provide an enum with two possible values instead. ```java public enum Indentation { INDENT, DO_NOT_INDENT } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Indentation indent ) { switch (indent) { case INDENT -> ... case DO_NOT_INDENT -> ... 
} } } ``` ```java writeJson(out, json, Indentation.INDENT); writeJson(out, json, Indentation.DO_NOT_INDENT); ``` --- # Hypothetical 2 Say now you get some feedback that while the indentation style is great for objects, it is sometimes not great for JSON with long arrays. ### No indentation ```json [{"numbers":[1,2,3]}] ``` ### Indent Everything ```json [ { "numbers": [ 1, 2, 3 ] } ] ``` ### Indent Objects ```json [{ "numbers": [1, 2, 3] }] ``` ### Indent Arrays ```json [ {"numbers": [ 1, 2, 3 ]} ] ``` ## Option 5. Make methods for requested combinations Your users just want a way to turn off indentation for arrays. Give it to them. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json ) { ... } public static void writeJsonIndentObjects( Appendable out, Json json ) { ... } public static void writeJsonIndentEverything( Appendable out, Json json ) { ... } } ``` ```java writeJson(out, json); writeJsonIndentObjects(out, json); writeJsonIndentEverything(out, json); ``` ## Option 6. Make methods for every combination There are four logical settings that come out of two different flags, so you can certainly provide all four options as methods. Could save you time later. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json ) { ... } public static void writeJsonIndentObjects( Appendable out, Json json ) { ... } public static void writeJsonIndentArrays( Appendable out, Json json ) { ... } public static void writeJsonIndentEverything( Appendable out, Json json ) { ... } } ``` ```java writeJson(out, json); writeJsonIndentObjects(out, json); writeJsonIndentArrays(out, json); writeJsonIndentEverything(out, json); ``` ## Option 7. Have two boolean arguments With two logically independent things to configure, you can always take a boolean for each.
```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, boolean indentObjects, boolean indentArrays ) { if (indentObjects) { if (indentArrays) { ... } else { ... } } else { ... } } } ``` ```java writeJson(out, json, true, true); writeJson(out, json, true, false); writeJson(out, json, false, true); writeJson(out, json, false, false); ``` ## Option 8. Have two enum arguments Booleans describe everything, but enums are still more explicit. ```java public enum Indentation { INDENT, DO_NOT_INDENT } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Indentation indentObjects, Indentation indentArrays ) { switch (indentObjects) { case INDENT -> switch (indentArrays) { case INDENT -> ... case DO_NOT_INDENT -> ... } case DO_NOT_INDENT -> ... } } } ``` ```java writeJson(out, json, Indentation.INDENT, Indentation.INDENT); writeJson(out, json, Indentation.INDENT, Indentation.DO_NOT_INDENT); writeJson(out, json, Indentation.DO_NOT_INDENT, Indentation.INDENT); writeJson( out, json, Indentation.DO_NOT_INDENT, Indentation.DO_NOT_INDENT ); ``` ## Option 9. Take options as bit flags It's an old school solution and maybe a bit too clever, but you are feeling old school and clever. ```java public final class Indentation { public static final int NO_INDENTATION = 0b00; public static final int INDENT_OBJECTS = 0b01; public static final int INDENT_ARRAYS = 0b10; private Indentation() {} } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, int indentation ) { if ((indentation & Indentation.INDENT_OBJECTS) != 0) { if ((indentation & Indentation.INDENT_ARRAYS) != 0) { ... } else { ... } } else { ... 
} } } ``` ```java writeJson( out, json, Indentation.INDENT_OBJECTS | Indentation.INDENT_ARRAYS ); writeJson(out, json, Indentation.INDENT_OBJECTS); writeJson(out, json, Indentation.INDENT_ARRAYS); writeJson(out, json, Indentation.NO_INDENTATION); ``` ## Option 10. Take an EnumSet Rather than waste a parameter on each flag, explicitly take the set of behaviors they want to enable. ```java public enum Indent { OBJECTS, ARRAYS } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent ) { if (indent.contains(Indent.OBJECTS)) { if (indent.contains(Indent.ARRAYS)) { ... } else { ... } } else { ... } } } ``` ```java writeJson(out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS)); writeJson(out, json, EnumSet.of(Indent.OBJECTS)); writeJson(out, json, EnumSet.of(Indent.ARRAYS)); writeJson(out, json, EnumSet.noneOf(Indent.class)); ``` ## Option 11. Take a transparent config object Similar to just taking two booleans, but putting them in an object means you can refer to a set of options as a concrete "thing". This could help you keep the most common usages terse. ```java record Options(boolean indentObjects, boolean indentArrays) { public static final Options INDENT_EVERYTHING = new Options(true, true); public static final Options NO_INDENT = new Options(false, false); } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Options options ) { if (options.indentObjects()) { if (options.indentArrays()) { ... } else { ... } } else { ... } } } ``` ```java writeJson(out, json, Options.INDENT_EVERYTHING); writeJson(out, json, new Options(true, false)); writeJson(out, json, new Options(false, true)); writeJson(out, json, Options.NO_INDENT); ``` ## Option 12. Take an opaque config object Maybe you want to give your api some extra wiggle room to grow. Maybe you just like how the usage of an opaque object made from a builder looks.
With this approach you can choose to internally represent things as booleans, enums, an enum set, bit flags, or whatever other evil lies within the hearts of mankind. ```java public final class Options { private final boolean indentObjects; private final boolean indentArrays; private Options(Builder builder) { this.indentArrays = builder.indentArrays; this.indentObjects = builder.indentObjects; } public boolean indentArrays() { return this.indentArrays; } public boolean indentObjects() { return this.indentObjects; } public static Options standard() { return builder().build(); } public static Builder builder() { return new Builder(); } public static final class Builder { private boolean indentObjects; private boolean indentArrays; private Builder() { this.indentObjects = false; this.indentArrays = false; } public Builder indentObjects() { this.indentObjects = true; return this; } public Builder indentArrays() { this.indentArrays = true; return this; } public Options build() { return new Options(this); } } } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Options options ) { if (options.indentObjects()) { if (options.indentArrays()) { ... } else { ... } } else { ... } } } ``` ```java writeJson( out, json, Options.builder() .indentObjects() .indentArrays() .build() ); writeJson( out, json, Options.builder() .indentObjects() .build() ); writeJson( out, json, Options.builder() .indentArrays() .build() ); writeJson(out, json, Options.standard()); ``` --- # Hypothetical 3. 🚑 Weewoo Weewoo 🚑 It's legal! Your APIs are great and all, but when you send data to external clients we would really like to include an explicit statement of copyright. That copyright message might change depending on your contract with the client and also we shouldn't send it internally. Good luck, legal out.
### With indentation, without copyright ```json [ { "name": "joe", "age": 35 } ] ``` ### With indentation, with copyright ```json { "copyright": "(c) 2022 Inc.", "data": [ { "name":"joe", "age":35 } ] } ``` ## Option 13. Add 4 more methods to hit the new combinations ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json ) { ... } public static void writeJsonIndentObjects( Appendable out, Json json ) { ... } public static void writeJsonIndentArrays( Appendable out, Json json ) { ... } public static void writeJsonIndentEverything( Appendable out, Json json ) { ... } public static void writeJsonWithCopyright( Appendable out, Json json, String copyright ) { ... } public static void writeJsonIndentObjectsWithCopyright( Appendable out, Json json, String copyright ) { ... } public static void writeJsonIndentArraysWithCopyright( Appendable out, Json json, String copyright ) { ... } public static void writeJsonIndentEverythingWithCopyright( Appendable out, Json json, String copyright ) { ... } } ``` ```java writeJson(out, json); writeJsonIndentObjects(out, json); writeJsonIndentArrays(out, json); writeJsonIndentEverything(out, json); writeJsonWithCopyright(out, json, "(c) 2022"); writeJsonIndentObjectsWithCopyright(out, json, "(c) 2022"); writeJsonIndentArraysWithCopyright(out, json, "(c) 2022"); writeJsonIndentEverythingWithCopyright(out, json, "(c) 2022"); ``` ## Option 14. Add a single new method If your boolean-like options only took up a single overload, you can get away with just adding a single new method to the list. This will look different depending on whether you used booleans, enums, an `EnumSet`, or bit flags. ```java public enum Indent { OBJECTS, ARRAYS } ...
public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent ) { } public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent, String copyright ) { } } ``` ```java writeJson(out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS)); writeJson(out, json, EnumSet.of(Indent.OBJECTS)); writeJson(out, json, EnumSet.of(Indent.ARRAYS)); writeJson(out, json, EnumSet.noneOf(Indent.class)); writeJson( out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS), "(c) 2022" ); writeJson( out, json, EnumSet.of(Indent.OBJECTS), "(c) 2022" ); writeJson( out, json, EnumSet.of(Indent.ARRAYS), "(c) 2022" ); writeJson( out, json, EnumSet.noneOf(Indent.class), "(c) 2022" ); ``` ## Option 15. Add another argument and accept null If you don't want to add yet another overload, you can always just allow users to pass `null`. ```java public enum Indent { OBJECTS, ARRAYS } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent, String copyright ) { if (indent.contains(Indent.OBJECTS)) { if (indent.contains(Indent.ARRAYS)) { if (copyright == null) { ... } else { ... } } else { ... } } else { ... } } } ``` ```java writeJson(out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS), null); writeJson(out, json, EnumSet.of(Indent.OBJECTS), null); writeJson(out, json, EnumSet.of(Indent.ARRAYS), null); writeJson(out, json, EnumSet.noneOf(Indent.class), null); writeJson( out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS), "(c) 2022" ); writeJson(out, json, EnumSet.of(Indent.OBJECTS), "(c) 2022"); writeJson(out, json, EnumSet.of(Indent.ARRAYS), "(c) 2022"); writeJson(out, json, EnumSet.noneOf(Indent.class), "(c) 2022"); ``` ## Option 16. Add another argument and make it be an `Optional` It's time to paint a bike shed. You can do this with `java.util.Optional` or your own sealed type. ```java public enum Indent { OBJECTS, ARRAYS } ... 
public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent, Optional<String> copyright ) { if (indent.contains(Indent.OBJECTS)) { if (indent.contains(Indent.ARRAYS)) { if (copyright.isPresent()) { ... copyright.orElseThrow() ... } else { ... } } else { ... } } else { ... } } } ``` ```java writeJson( out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS), Optional.empty() ); writeJson( out, json, EnumSet.of(Indent.OBJECTS), Optional.empty() ); writeJson( out, json, EnumSet.of(Indent.ARRAYS), Optional.empty() ); writeJson( out, json, EnumSet.noneOf(Indent.class), Optional.empty() ); writeJson( out, json, EnumSet.of(Indent.OBJECTS, Indent.ARRAYS), Optional.of("(c) 2022") ); writeJson( out, json, EnumSet.of(Indent.OBJECTS), Optional.of("(c) 2022") ); writeJson( out, json, EnumSet.of(Indent.ARRAYS), Optional.of("(c) 2022") ); writeJson( out, json, EnumSet.noneOf(Indent.class), Optional.of("(c) 2022") ); ``` ## Option 17. Add a nullable property to a transparent config object ```java record Options( boolean indentObjects, boolean indentArrays, String copyright) { public static final Options INDENT_EVERYTHING = new Options(true, true, null); public static final Options NO_INDENT = new Options(false, false, null); } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Options options ) { if (options.indentObjects()) { if (options.indentArrays()) { if (options.copyright() == null) { ... } else { ... } } else { ... } } else { ... } } } ``` ## Option 18. Add an `Optional` property to a transparent config object Same as above, but with the property explicitly marked as optional. The catch is that if you are using records as your transparent data carriers, you need to take an `Optional` in your constructor.
```java record Options( boolean indentObjects, boolean indentArrays, Optional<String> copyright ) { public static final Options INDENT_EVERYTHING = new Options(true, true, Optional.empty()); public static final Options NO_INDENT = new Options(false, false, Optional.empty()); } ``` ## Option 19. Add another property to an opaque config object ```java public final class Options { private final boolean indentObjects; private final boolean indentArrays; private final String copyright; private Options(Builder builder) { this.indentArrays = builder.indentArrays; this.indentObjects = builder.indentObjects; this.copyright = builder.copyright; } public boolean indentArrays() { return this.indentArrays; } public boolean indentObjects() { return this.indentObjects; } public Optional<String> copyright() { return Optional.ofNullable(this.copyright); } public static Options standard() { return builder().build(); } public static Builder builder() { return new Builder(); } public static final class Builder { private boolean indentObjects; private boolean indentArrays; private String copyright; private Builder() { this.indentObjects = false; this.indentArrays = false; this.copyright = null; } public Builder indentObjects() { this.indentObjects = true; return this; } public Builder indentArrays() { this.indentArrays = true; return this; } public Builder copyright(String copyright) { this.copyright = copyright; return this; } public Options build() { return new Options(this); } } } ``` ```java writeJson( out, json, Options.builder() .indentObjects() .indentArrays() .build() ); writeJson( out, json, Options.builder() .indentObjects() .build() ); writeJson( out, json, Options.builder() .indentArrays() .build() ); writeJson(out, json, Options.standard()); writeJson( out, json, Options.builder() .indentObjects() .indentArrays() .copyright("(c) 2022") .build() ); writeJson( out, json, Options.builder() .indentObjects() .copyright("(c) 2022") .build() ); writeJson( out, json, Options.builder() 
.indentArrays() .copyright("(c) 2022") .build() ); writeJson( out, json, Options.builder() .copyright("(c) 2022") .build() ); ``` --- # Hypothetical 4. It would be a lot more efficient if you started sending your JSON in a binary format like MessagePack. It has the same data model as JSON, so it should work out. Also, when sending in that binary format there is a choice between "Little Endian" and "Big Endian". Problem is, there really isn't a meaning to indentation in a binary format or to endianness in a text one. ## Option 20. Make separate methods For a split as fundamental as this, it might make sense to start to make an entirely separate API for the new JSON-like format. ```java public enum Indent { OBJECTS, ARRAYS } ... enum Endianness { BIG_ENDIAN, LITTLE_ENDIAN } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, EnumSet<Indent> indent, Optional<String> copyright ) { ... } public static void writeMessagePack( Appendable out, Json json, Endianness endianness, Optional<String> copyright ) { ... } } ``` ```java writeJson( out, json, EnumSet.of(Indent.OBJECTS), Optional.of("(c) 2022") ); writeMessagePack( out, json, Endianness.BIG_ENDIAN, Optional.of("(c) 2022") ); ``` ## Option 21. Make an interface and use dispatch You were surprised you didn't think of this first. Dynamic dispatch is some classic Java stylings. ```java public interface JsonWriter { void write(Appendable out, JSON json); } ... public final class TextJsonWriter implements JsonWriter { private final boolean indentObjects; private final boolean indentArrays; private final String copyright; public TextJsonWriter( boolean indentObjects, boolean indentArrays, String copyright ) {} @Override public void write(Appendable out, JSON json) { ... } } ... enum Endianness { BIG_ENDIAN, LITTLE_ENDIAN } ... 
public final class BinaryJsonWriter implements JsonWriter { private final Endianness endianness; private final String copyright; public BinaryJsonWriter( Endianness endianness, String copyright ) {} @Override public void write(Appendable out, JSON json) { ... } } ``` ```jshelllanguage new BinaryJsonWriter( Endianness.BIG_ENDIAN, "(c) 2022" ).write(out, json); new TextJsonWriter( true, false, "(c) 2022" ).write(out, json); ``` ## Option 22. Take everything as an object and figure it out at runtime. You need to choose whether you silently ignore bad combinations of objects and what behaviors get preference, but there is a simplicity to just throwing it all into a record or opaque object and figuring it out from there. ```java enum Endianness { BIG_ENDIAN, LITTLE_ENDIAN } record Options( Boolean indentObjects, Boolean indentArrays, String copyright, boolean useBinary, Endianness endianness ) { public static final Options INDENT_EVERYTHING = new Options( true, true, null, false, null ); public static final Options NO_INDENT = new Options( false, false, null, false, null ); public static final Options BINARY_LE = new Options( null, null, null, true, Endianness.LITTLE_ENDIAN ); } ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Options options ) { if (options.useBinary() && (options.indentArrays() != null || options.indentObjects() != null)) { // ignore or throw ... } else if (!options.useBinary() && options.endianness() != null) { ... } else { ... } } } ``` ```java writeJson( out, json, Options.INDENT_EVERYTHING ); writeJson( out, json, Options.BINARY_LE ); writeJson( out, json, new Options( null, null, null, true, Endianness.BIG_ENDIAN ) ); ``` ## Option 23. Model valid choices in the type hierarchy. With a bit of restructuring, you can actually make an `Options` object that will correctly handle having that disjoint set of options. 
Maybe not what you would choose with 100 settings or more complicated legality restrictions, but for this case it all seems to work out. ```java enum Endianness { BIG_ENDIAN, LITTLE_ENDIAN } sealed interface Options permits BinaryOptions, TextOptions { Optional<String> copyright(); } record TextOptions( @Override Optional<String> copyright, boolean indentObjects, boolean indentArrays ) implements Options {} record BinaryOptions( @Override Optional<String> copyright, Endianness endianness ) implements Options {} ... public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Options options ) { switch (options) { case TextOptions textOptions -> { ... } case BinaryOptions binaryOptions -> switch (binaryOptions.endianness()) { case BIG_ENDIAN -> ... case LITTLE_ENDIAN -> ... } } } } ``` ```java writeJson( out, json, new TextOptions(Optional.of("(c) 2022"), true, false) ); writeJson( out, json, new BinaryOptions(Optional.empty(), Endianness.BIG_ENDIAN) ); ``` ## Option 24. Give up on typing it, just pass a map This was always an option. It works just as well here as it does in a dynamic language, it's just a tad more verbose and unsafe. ```java public final class JsonWriter { private JsonWriter() {} public static void writeJson( Appendable out, Json json, Map<String, Object> options ) { var copyright = options.get("copyright"); if (copyright == null) { ... var endianness = options.get("binary"); ... } else if (copyright instanceof String copyrightString) { ... } else { throw new IllegalArgumentException(...); } } } ``` ```java writeJson( out, json, Map.of( "indentObjects", true, "copyright", "(c) 2022" ) ); writeJson( out, json, Map.of( "binary", true, "endianness", Endianness.BIG_ENDIAN ) ); ``` --- # Hypothetical 5. You've taken the mouse to the movies. People don't need all the configuration options you've provided and don't like using the API that has them. They want a simpler API. 
Maybe you should have gone with option 1. --- Exercise for the reader. Make a spreadsheet of all these options versus all the criteria you use to evaluate software and fill in the grid with smiley faces, frowny faces, and meh faces. Feel free to fill in some options I missed, like reading from environment variables, system properties, or more inheritance schemes. Tue, 08 Feb 0022 05:00:00 +0000Why Java didn't add flow typinghttps://mccue.dev/pages/1-25-22-method-overload-with-new-switch Java's method priority rules are part of the reason why pattern matching works the way it does. Say you had code that you wrote in the past 30 years that looks like this ```java public class Example{ public static void main(String[] args){ Number i = 3; if (i instanceof Integer) { doTask(i); } } public static void doTask(Integer i){ System.out.println("integer"); } public static void doTask(Number i){ System.out.println("number"); } } ``` This will output `number` even though `i` is an `Integer` because the stated type of `i` is `Number`. In other languages like kotlin, if you check if `i` is an `Integer`, within the block of that if it will be considered an integer by the compiler. 
Since java has legacy code that might have done something like this, you have to explicitly name a variable that will be considered an `Integer` after the check ```java Number number = 3; if (number instanceof Integer integer) { doTask(integer); // will select the integer overload doTask(number); // will select the number overload // integer and number both refer to the same object // so this will output true System.out.println(integer == number); } ``` While this might feel like a nuisance, structuring the language feature like this enables more generic "pattern matching" syntax that you can read about here (coming in the next few years) https://openjdk.java.net/jeps/405 Tue, 25 Jan 0022 05:00:00 +0000The switch that only I likehttps://mccue.dev/pages/1-24-22-silly-switch One thing that I think is really cool, but no one else cares about, is that if you make a sealed interface which only has one implementor (say, if you are enriching a class via generated code) then there is actually a way for you to make a "type safe cast" between the two things. What i mean by this is, consider a normal cast ```java Integer i = 3; Object o = i; String s = (String) o; // boom, ClassCastException ``` Normal casts are allowed by the language to fail. If you did not properly track the types of things and you try to cast to a type you are not compatible with your code will still compile, but it will crash at runtime. Even if you write an interface that has a single implementor, the language can't guarantee that later on there won't be another one added. ```java interface Thing { void exist(); } class OnlyThingImpl implements Thing { int specialThing = 8; @Override public void exist() {} } ... // This code works perfectly well Thing thing = ...; OnlyThingImpl thingImpl = (OnlyThingImpl) thing; System.out.println(thingImpl.specialThing); ... 
// but if an intern adds another implementation, java won't warn you about // the old usage and it will crash at runtime if you happen to call that code // with the wrong implementation. class OtherThingImpl implements Thing { @Override public void exist() { System.out.println("party"); } } ``` Even making the interface sealed doesn't solve this, since Java won't force you to handle other possibilities in your casts if another case is added to the sealed class ```java sealed interface Thing permits OnlyThingImpl { void exist(); } ... // This still has the same problem - you might forget about this code OnlyThingImpl thingImpl = (OnlyThingImpl) thing; ``` What can solve this is a "safe cast" done via a pattern matching switch. ```java OnlyThingImpl thingImpl = switch (thing) { case OnlyThingImpl __ -> __; }; ``` Mon, 24 Jan 0022 05:00:00 +0000Why you should care about Sealed Typeshttps://mccue.dev/pages/1-24-22-sealed-types As of Java 17 (with --enable-preview) you can combine two features - pattern matching and sealed types - to represent and deal with cases where two things have different sets of data. Say you want to represent three different kinds of items you can check out from a library. 
A book, which has a title and author ```java record Book(String title, String author) {} ``` A CD, which has a runtime and genre ```java record CD(double runtime, String genre) {} ``` And a VHS tape, which contains precious memories ```java record VHSTape(List<Memory> preciousMemories) {} ``` What you can do to have a method which might return any one of these cases, or to have a list of any item in the library, is make a common interface that all three of them share ```java interface LibraryItem {} record Book(String title, String author) implements LibraryItem {} record CD(double runtime, String genre) implements LibraryItem {} record VHSTape(List<Memory> preciousMemories) implements LibraryItem {} ``` The problem is that if you have a LibraryItem object, there isn't much you can actually do with it since the three cases of library items have different kinds of data and thus you cant access everything from the interface ```java LibraryItem item = ...; item.title(); // nope item.runtime(); // also nope item.preciousMemories(); // you'll never get them back ``` To remedy this, you can make the interface sealed. This guarantees to the language that Book, CD, and VHSTape are the only classes which implement LibraryItem ```java sealed interface LibraryItem permits Book, CD, VHSTape {} ``` Now if you have an instance of library item, you can use it with a "pattern switch expression" to safely recover the actual type of the item ```java LibraryItem item = ...; switch (item) { case Book book -> System.out.println(book.title()); case CD cd -> System.out.println(cd.genre()); case VHSTape tape -> System.out.println(tape.preciousMemories()); } ``` Mon, 24 Jan 0022 05:00:00 +0000Code generation with annotation processorshttps://mccue.dev/pages/1-23-22-code-generation Java does not allow annotation processors to affect the source or bytecode of classes you have written, only generate new classes. In Java 17, there is a clever way around this restriction though. 
Lets say you want to make an annotation processor that adds a `toJson` method to a class based on some automated set of rules. ```java @AutoToJson public record BasicThing(String color, String size) {} ``` What you can do is generate an interface with a predictable name ```java interface BasicThingToJson {} ``` Make it "sealed", so that only the class you want to can implement it ```java sealed interface BasicThingToJson permits BasicThing {} ``` And then inside of the interface you can add a default method, in which it is safe to assume that the only possible class that implements the interface is the class you want to ```java sealed interface BasicThingToJson permits BasicThing { default JSON toJson() { // totally safe var self = (BasicThing) this; var jsonObject = new JsonObject(); jsonObject.set("color", self.color()); jsonObject.set("size", self.size()); return jsonObject; } } ``` And then all your user has to do is 1. Annotate their class 2. Implement the interface that will be generated (its kinda circular, but it works out) ```java @AutoToJson public record BasicThing(String color, String size) implements BasicThingToJson {} ``` And then boom, their class has been "enriched" in whatever way you want. ```java var basicThing = new BasicThing("red", "small"); var json = basicThing.toJson(); ``` Sun, 23 Jan 0022 05:00:00 +0000Basics of Annotation Processorshttps://mccue.dev/pages/1-23-22-annotation-processor If you register an implementation of `javax.annotation.processing.Processor` with the ServiceLoader mechanism then you can make what is called an "Annotation Processor". Annotations are just metadata that can be put onto classes, fields, etc. You can declare your own annotation like this ```java package some.pack; @Target({ElementType.TYPE}) @Retention(RetentionPolicy.SOURCE) public @interface YourAnnotation { } ``` In this example the annotation can only be used on "types" like classes or interfaces and the metadata is only available on the source code. 
So we can use that example annotation to mark a class of ours ```java @YourAnnotation class SomeClass {} ``` And any annotation processors that the ServiceLoader can find at compile time can do stuff like generate brand new code, or add new compile time checks. ```java @SupportedAnnotationTypes("some.pack.YourAnnotation") @SupportedSourceVersion(SourceVersion.RELEASE_17) public final class AnnotationProcessor extends AbstractProcessor { @Override public boolean process( Set<? extends TypeElement> annotations, RoundEnvironment roundEnv ) { var filer = this.processingEnv.getFiler(); var elements = roundEnv.getElementsAnnotatedWith(YourAnnotation.class); /* process is called again on later rounds, so skip them rather than creating the same file twice */ if (elements.isEmpty()) { return false; } try { var file = filer.createSourceFile("brand.fresh.Code"); try (var writer = file.openWriter()) { // there are libraries like javapoet which make doing this easier writer.append(""" package brand.fresh; class Code { public static final int NUMBER_OF_INSTANCES = %s; } """.formatted(elements.size())); } } catch (IOException e) { throw new RuntimeException(e); } return true; } } ``` In this case we generate a new class that has the number of annotated classes as a constant. 
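The "add new compile time checks" half can be sketched the same way. This is a hypothetical example rather than anything from a real library: `MustBeFinalProcessor` and its error message are made-up names, and it assumes the `some.pack.YourAnnotation` annotation declared earlier.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.Modifier;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Hypothetical processor: fails the build when an annotated type is not final.
// In a real project this class would be public and live in its own file,
// since the ServiceLoader needs a public class with a zero argument constructor.
@SupportedAnnotationTypes("some.pack.YourAnnotation")
final class MustBeFinalProcessor extends AbstractProcessor {
    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(
            Set<? extends TypeElement> annotations,
            RoundEnvironment roundEnv
    ) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                if (!element.getModifiers().contains(Modifier.FINAL)) {
                    // Reporting an ERROR makes javac fail and points at the element.
                    processingEnv.getMessager().printMessage(
                            Diagnostic.Kind.ERROR,
                            "types marked with @YourAnnotation must be final",
                            element
                    );
                }
            }
        }
        // Returning false leaves the annotation available to other processors.
        return false;
    }
}
```

Nothing gets generated here; the only output is a compiler error attached to the offending class.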
Sun, 23 Jan 0022 05:00:00 +0000The Service Loader Mechanismhttps://mccue.dev/pages/1-20-22-service-provider-interface If you have an interface or abstract class defined in some jar ```java package whatever.project; public interface DoesThing { void doThing(); } ``` And in another jar you have one or more implementations of that interface, each with a zero argument constructor ```java package something.other; import whatever.project.DoesThing; public final class DoesThingImpl implements DoesThing { @Override public void doThing() { System.out.println("I implemented this in a certain way"); } } ``` as well as a file in that jar with the interface name under `META-INF/services` ``` META-INF/services/whatever.project.DoesThing ``` which has a line that has the name of the implementing class ``` something.other.DoesThingImpl ``` Then you can obtain an implementation of that interface via the service loader mechanism ```java var loader = ServiceLoader.load(DoesThing.class); for (var thingDoer : loader) { thingDoer.doThing(); } ``` This is some really core magic and is how most of the projects that want you to just add dependencies to get functionality - like slf4j, jdbc, twelvemonkeys, etc - do their thing. Thu, 20 Jan 0022 05:00:00 +0000Factories in FPhttps://mccue.dev/pages/1-17-22-factories-in-fp ## Question from Muhammad Hamza Chippa > >How would you replace Factory design patterns in functional programming? Before someone else says it - with a function ```java interface ThingFactory { Thing makeThing(); } ... 
void work(ThingFactory factory) { var thing = factory.makeThing(); thing.whatever(); } ``` ```clojure (defn work [factory] (let [thing (factory)] (.whatever thing))) ``` You can also use the exact pattern as is (where the producer method gets a special name) with traits/typeclasses/protocols depending on your FP language Mon, 17 Jan 0022 05:00:00 +0000Aliasing core functions in Clojurehttps://mccue.dev/pages/1-1-22-aliasing-core-functions ## Question from rmxm > > Funny question, how do you cover for "get" word being a function name, say you have namespace 'computer, surely it would be intuitive to have "get" function obtaining "computer". Just wondering how do you overcome this linguitically 🙂 or do you just place get at the bottom You are allowed to have a function named get - you just need to put `(:refer-clojure :exclude [get])` in your namespace declaration to avoid warnings. And then you refer to the core `get` as `clojure.core/get` within that file. so ```clojure (ns some.computer (:refer-clojure :exclude [get])) (defn get [o] (clojure.core/get o :thing)) ``` or ```clojure (ns some.computer (:require [clojure.core :as core]) (:refer-clojure :exclude [get])) (defn get [o] (core/get o :thing)) ``` > sure, sure its not a technical question > more of, do you use a different function name, or do you choose another word etc. I do this with `update` in some contexts. No I just use get if I want to - but you have to be more specific about the use case for me to give a better name. 
You can usually call it `get-thing` or `thing` > If you mention both get and update in this case I will assume this is ok practice its fine, depends on your domain > sure, sure, thanks 🙂 > > I could also `get*` or something like that for, "c/grud" stuff, `get*, update*, delete*, create*` seems, ok I think if the intended usage of the namespace is `:require [a.b.thing :as thing], thing/func` and the ns is small and focused (enough that you wont forget you shadowed a core fn) then don’t worry about it > thanks, going the get* route, this less bother down the road i think > Sat, 01 Jan 0022 05:00:00 +0000How can you write this productSum algorithm written in JS in Java?https://mccue.dev/pages/10-15-21-product-sum-algorithm ## Question from u/warrior242 > I want to write this algorithm from JS to Java and I am having some problem doing it because Java does not support parameter defaults and does not do dynamic inputs. I know this must be somehow possible and even better written in a typed language but I dont know how. Please help Java Gods. > > The algorithm is supposed to go through the array, and add it up and multiply each part by the depth that its at. > > So something like `[1, 2]` > > would be something like > > `(1 * 1) + (2 * 1)`. > > Something at depth `[1, 2, [3, 4] ]` > > would be: > > (1 * 1) + (2 * 1). + (( 3 + 4) *2) > > As far as I understand anyways > > Link for JS version of algorithm: > ```javascript > class ProductSum{ > constructor() { > > } > > solution(array, multiplier = 1) { > let sum = 0; > > for (let i = 0; i < array.length; i++) { > if (array[i] instanceof Array) { > sum += this.solution (array[i], multiplier + 1) > } > > else { > sum = sum + array[i]; > } > } > return sum * multiplier; > } > } > > const myArray = [5, 2, [7, -1], 3, [6, [-13, 8], 4]]; > > let mySolution = new ProductSum(); > let result = mySolution.solution(myArray); > > console.log(result); > ``` So for java, first you need to represent this "type" of data in some way. 
You have a List of either single numbers or other Lists containing the same. You can represent this like so ```java sealed interface NestedNumberList { record SingleNumber(int x) implements NestedNumberList {} record MultipleNumbers(List<NestedNumberList> numbers) implements NestedNumberList {} } ``` Which you should read as a `NestedNumberList` is one of `SingleNumber` or `MultipleNumbers`. That makes your example data look like this ``` [5, 2, [7, -1], 3, [6, [-13, 8], 4]] ``` ```java var myArray = new MultipleNumbers(List.of( new SingleNumber(5), new SingleNumber(2), new MultipleNumbers(List.of( new SingleNumber(7), new SingleNumber(-1) )), new SingleNumber(3), new MultipleNumbers(List.of( new SingleNumber(6), new MultipleNumbers(List.of( new SingleNumber(-13), new SingleNumber(8) )), new SingleNumber(4) )) )); ``` Which yes, is way more verbose, but let's not dwell on that. This is a bit of a pathological case. ```java class ProductSum { int solution(MultipleNumbers array, int multiplier) { int sum = 0; for (var num : array.numbers()) { switch (num) { case SingleNumber single -> { sum = sum + single.x(); } case MultipleNumbers multiple -> { sum += this.solution(multiple, multiplier + 1); } } } return sum * multiplier; } int solution(MultipleNumbers array) { return this.solution(array, 1); } } ``` ```java System.out.println(new ProductSum().solution(myArray)); ``` So that's roughly conceptually equivalent. (for Java 17 with preview features). 
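If you would rather not turn on preview features, roughly the same thing can be sketched with `instanceof` patterns, which are a regular feature in Java 17. Treat this as a variation under assumptions rather than the version above: `ProductSumDemo`, `num`, `list`, and `example` are made-up names, and the trade-off is that the compiler no longer checks that every case is handled.

```java
import java.util.List;

sealed interface NestedNumberList {
    record SingleNumber(int x) implements NestedNumberList {}
    record MultipleNumbers(List<NestedNumberList> numbers) implements NestedNumberList {}
}

final class ProductSumDemo {
    // Small helpers (hypothetical) to cut down on the constructor noise.
    static NestedNumberList num(int x) {
        return new NestedNumberList.SingleNumber(x);
    }

    static NestedNumberList.MultipleNumbers list(NestedNumberList... items) {
        return new NestedNumberList.MultipleNumbers(List.of(items));
    }

    static int solution(NestedNumberList.MultipleNumbers array, int multiplier) {
        int sum = 0;
        for (NestedNumberList item : array.numbers()) {
            if (item instanceof NestedNumberList.SingleNumber single) {
                sum += single.x();
            } else if (item instanceof NestedNumberList.MultipleNumbers multiple) {
                // Nested lists are summed one level deeper.
                sum += solution(multiple, multiplier + 1);
            }
        }
        return sum * multiplier;
    }

    // Stands in for the JS default parameter `multiplier = 1`.
    static int solution(NestedNumberList.MultipleNumbers array) {
        return solution(array, 1);
    }

    // [5, 2, [7, -1], 3, [6, [-13, 8], 4]]
    static NestedNumberList.MultipleNumbers example() {
        return list(
                num(5), num(2),
                list(num(7), num(-1)),
                num(3),
                list(num(6), list(num(-13), num(8)), num(4))
        );
    }
}
```

For the example data, `ProductSumDemo.solution(ProductSumDemo.example())` works out to 12: the innermost list gives (-13 + 8) * 3 = -15, its parent gives (6 - 15 + 4) * 2 = -10, `[7, -1]` gives 6 * 2 = 12, and the top level is (5 + 2 + 12 + 3 - 10) * 1 = 12.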
For the default multiplier value, in this case we emulate that with method overloading.Fri, 15 Oct 0021 05:00:00 +0000Smuggling Checked Exceptions with Sealed Interfaceshttps://mccue.dev/pages/11-1-21-smuggling-checked-exceptions ## Vanilla Code ```java import java.sql.Connection; import java.sql.SQLException; import java.util.ArrayList; import java.util.List; import java.util.Optional; record User(String name) {} public class VanillaCode { public static Optional<User> lookupUser(Connection db, int id) throws SQLException { var statement = db.prepareStatement("SELECT name FROM USER where id = ?"); statement.setInt(1, id); var resultSet = statement.executeQuery(); if (resultSet.next()) { return Optional.of(new User(resultSet.getString(1))); } else { return Optional.empty(); } } public static List<Optional<User>> lookupMultipleUsers(Connection db, List<Integer> ids) throws SQLException { List<Optional<User>> users = new ArrayList<>(); for (int id : ids) { users.add(lookupUser(db, id)); } return users; } public static void exampleUsage(Connection db) { try { lookupUser(db, 123).ifPresentOrElse( user -> System.out.println("FOUND USER: " + user), () -> System.out.println("NO SUCH USER") ); } catch (SQLException sqlException) { System.out.println("ERROR RUNNING QUERY: " + sqlException); } } } ``` ## Sealed Types ```java import java.sql.Connection; import java.sql.SQLException; import java.util.List; record User(String name) {} sealed interface UserLookupResult { record FoundUser(User user) implements UserLookupResult {} record NoSuchUser() implements UserLookupResult {} record ErrorRunningQuery(SQLException sqlException) implements UserLookupResult {} } public class SealedTypes { public static UserLookupResult lookupUser(Connection db, int id) { try { var statement = db.prepareStatement("SELECT name FROM USER where id = ?"); statement.setInt(1, id); var resultSet = statement.executeQuery(); if (resultSet.next()) { return new UserLookupResult.FoundUser(new 
User(resultSet.getString(1))); } else { return new UserLookupResult.NoSuchUser(); } } catch (SQLException e) { return new UserLookupResult.ErrorRunningQuery(e); } } public static List<UserLookupResult> lookupMultipleUsers(Connection db, List<Integer> ids) { return ids .stream() .map(id -> lookupUser(db, id)) .toList(); } public static void exampleUsage(Connection db) { switch (lookupUser(db, 123)) { case UserLookupResult.FoundUser foundUser -> System.out.println("FOUND USER: " + foundUser.user()); case UserLookupResult.NoSuchUser __ -> System.out.println("NO SUCH USER"); case UserLookupResult.ErrorRunningQuery errorRunningQuery -> System.out.println("ERROR RUNNING QUERY: " + errorRunningQuery.sqlException()); } } } ``` --- Sealed interfaces let you properly represent "sum types". This thing is either "A" or "B". In this case we have a `Stream<Integer>` that we want to turn into a `Stream<User>`, but our method that takes `Integer -> User` throws a `SQLException`. Your options before sealed interfaces were to 1. Make it a `Stream<User>` by re-throwing any `SQLException` as a `RuntimeException`. As other comments have mentioned, this can be an issue depending on your type of stream. Also it requires that you fail the whole procedure if getting any one `User` fails 2. Make it a `Stream<Object>` by returning any `SQLException` as a value, then cast it back at the end with `instanceof` checks. 3. Make it a `Stream<Try<User>>` like with `vavr`. This works if you don't care about the exception and just care that it failed in some way, but you won't be able to call methods particular to `SQLException` like `getSQLState` without casting. Also you lose documentation of why something can fail. 4. Do this same technique, but without a sealed interface. Doing this without needing to have `default -> ... error ...` branches on all your switches would require using the visitor pattern. 
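For comparison, option 4 can be sketched roughly like this. It is an illustration under assumptions, not code from any library: the `Visitor` interface, its method names, and `VisitorExample` are all made up for the example.

```java
import java.sql.SQLException;

record User(String name) {}

// Not sealed, so the compiler cannot check a switch over the cases.
// Exhaustiveness comes from the Visitor interface instead: adding a new
// kind of result means adding a Visitor method, which breaks every
// existing visitor until it handles the new case.
interface UserLookupResult {
    <R> R accept(Visitor<R> visitor);

    interface Visitor<R> {
        R foundUser(User user);
        R noSuchUser();
        R errorRunningQuery(SQLException sqlException);
    }

    record FoundUser(User user) implements UserLookupResult {
        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.foundUser(user);
        }
    }

    record NoSuchUser() implements UserLookupResult {
        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.noSuchUser();
        }
    }

    record ErrorRunningQuery(SQLException sqlException) implements UserLookupResult {
        @Override
        public <R> R accept(Visitor<R> visitor) {
            return visitor.errorRunningQuery(sqlException);
        }
    }
}

final class VisitorExample {
    static String describe(UserLookupResult result) {
        return result.accept(new UserLookupResult.Visitor<String>() {
            @Override
            public String foundUser(User user) {
                return "FOUND USER: " + user;
            }

            @Override
            public String noSuchUser() {
                return "NO SUCH USER";
            }

            @Override
            public String errorRunningQuery(SQLException sqlException) {
                return "ERROR RUNNING QUERY: " + sqlException;
            }
        });
    }
}
```

It gets you the same guarantees, but every call site pays for them with an anonymous class instead of a switch.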
The new option is what you see in the sealed types version above: you can properly represent a function from `Integer` -> `User | SQLException` by wrapping each of the values in a sealed hierarchy. ``` sealed interface UserLookupResult { record FoundUser(User user) implements UserLookupResult {} record ErrorRunningQuery(SQLException sqlException) implements UserLookupResult {} } ``` So it becomes `Stream<Integer>` -> `Stream<UserLookupResult>`, which is effectively `Stream<FoundUser | ErrorRunningQuery>`. This gives you the most flexibility in how to interpret the result of the Stream without any casting or assumptions in usage. Also if you had another possibility like `UserIsBanned`, you can add it to the sealed hierarchy and all your switches would force you to fix them. Fri, 01 Oct 0021 05:00:00 +0000Static Dependency Injection with Intersection Typeshttps://mccue.dev/pages/9-6-21-static-dependency-injection One of the patterns I am personally partial to that I haven't really seen get traction or attention in Java is to do DI manually. Imagine a hypothetical web framework. ```java interface Handler<Context> { Response handle(Context context, Request request); } ... final class Router<Context> { ... Router(Context context) { ... } void addHandler(String route, Handler<Context> handler) { ... } } ``` We can make a router that carries through some context to the different handlers ```java List<String> ips = Collections.synchronizedList(new ArrayList<>()); Router<List<String>> router = new Router<>(ips); router.addHandler("/hello", (seenIps, req) -> { seenIps.add(req.ip()); return Response.of(seenIps.toString()); }); ``` And then all our route handlers will get the context. Then the trick is to make an interface for each stateful "thing" a handler might want access to, like a database connection or a redis connection. 
```java interface HasDB { DB db(); } interface HasRedis { Redis redis(); } ``` And similarly for anything that you might have that is "derivative" of those root stateful components ```java interface UserService { User findById(int id); } final class UserServiceImpl implements UserService { ... UserServiceImpl(DB db) { ... } @Override public User findById(int id) { ... } } interface HasUserService { UserService userService(); } ``` And make the "Context" be a type that implements all of those interfaces ```java record System(DB db, Redis redis) implements HasDB, HasRedis, HasUserService { @Override public UserService userService() { return new UserServiceImpl(db); } } ``` Then a handler just needs to "declare its dependencies" by saying which stateful components it wants to use. For example if we have a handler that just wants to lookup a user and write things into redis ```java public static <Components extends HasRedis & HasUserService> Response handleRequest( Components components, Request request ) { var redis = components.redis(); var userService = components.userService(); ... } ... System system = new System(...); Router<System> router = new Router<>(system); router.addHandler("/hello", Handlers::handleRequest); ``` And by virtue of `System` being `HasDB`, `HasRedis` and a `HasUserService` it will fulfill `HasRedis & HasUserService`. Replace "route handler" with whatever other entry points your app has and boom, dependency injection without reflection or magics. There are downsides - `System` might get fairly large depending on your preferences, it doesn't solve the problem of starting everything in the right order, and there is a decent amount of boilerplate - but I just wish more people knew about this "System pattern." Mon, 06 Sep 0021 05:00:00 +0000Handling numbers too big for base data typeshttps://mccue.dev/pages/8-18-21-how-to-handle-big-numbers ## Question from Skip#4185 > > How do you handle numbers that are too big to fit into the largest available data types? 
> > Would you define a struct containing multiple large data types and somehow split up the number? > > I'm not trying to do something like that, but i'm just curious if anyone knows what the right course of action would be in such a case End of day you should use a bigint library, but you can also "build" it yourself if you are so inclined ```c #include <stdio.h> #include <stdlib.h> enum bigint_kind { _64_BIT, _MORE_THAN_64_BIT }; union bigint_value { long u64; struct { long* blocks_of_64_bit; size_t length; } bigger_than_u64; }; struct bigint { enum bigint_kind _kind; union bigint_value _value; }; struct bigint* bigint_from_long(long value) { struct bigint* i = calloc(1, sizeof(struct bigint)); i->_kind = _64_BIT; union bigint_value value_; value_.u64 = value; i->_value = value_; return i; } void bigint_free(struct bigint* i) { if (i->_kind == _MORE_THAN_64_BIT) { free(i->_value.bigger_than_u64.blocks_of_64_bit); } free(i); } ``` If you are using rote C it can be a bit of a nightmare to track memory for darn numbers, since any of them can now contain a heap allocated array. In C you don't have "move semantics" like C++, so you end up doing unneeded heap allocations to track stuff, and there is also some pointer indirection here which is annoying. TL;DR: use a library for it. In Python, Java, etc. it should "just work". Python's numerics auto promote and Java has BigInteger. But the only languages on your profile are C and Python, so if you are asking the question... Exercise for the reader ```c struct bigint* bigint_add(struct bigint* a, struct bigint* b) { if (a->_kind == _64_BIT) { if (b->_kind == _64_BIT) { } else { } } else { } } ``` Wed, 18 Aug 0021 05:00:00 +0000Basic C++ value classhttps://mccue.dev/pages/8-18-21-basic-cpp-value-class A class in C++ is the same thing as a struct, just the `class` keyword makes the fields private by default instead of public by default with struct. 
Here is a complete example of a simple "money" class

```cplusplus
#pragma once

#include <cstdint>
#include <functional>
#include <ostream>
#include <optional>

namespace shopping {
    class Money final {
    private:
        constexpr explicit Money(std::uint32_t cents) noexcept: cents(cents) {}
        const std::uint32_t cents;

    public:
        constexpr static Money fromCents(std::uint32_t cents) noexcept {
            return Money(cents);
        }

        constexpr bool operator==(const Money& other) const noexcept {
            return this->cents == other.cents;
        }

        [[nodiscard]] constexpr std::uint32_t getCents() const noexcept {
            return this->cents;
        }

        friend std::ostream& operator<<(std::ostream &os, const Money &money) {
            os << "Money{cents=" << money.cents << "}";
            return os;
        }

        constexpr Money operator+(const Money& other) const noexcept {
            return Money::fromCents(this->getCents() + other.getCents());
        }

        // Subtraction can't go below zero, so it returns an optional
        constexpr std::optional<Money> operator-(const Money& other) const noexcept {
            if (other.getCents() > this->getCents()) {
                return std::nullopt;
            } else {
                return Money::fromCents(this->getCents() - other.getCents());
            }
        }
    };
}

namespace std {
    template <>
    struct hash<shopping::Money> {
        size_t operator()(const shopping::Money& money) const noexcept {
            return hash<uint32_t>{}(money.getCents());
        }
    };
}
```

This is the same as this Rust

```rust
#[derive(Debug, Eq, PartialEq, Hash)]
struct Money { cents: u32 }
```

Now maybe stop learning C++.

Wed, 18 Aug 0021 05:00:00 +0000

How to build C++ from a git repo
https://mccue.dev/pages/8-17-21-makefile-for-cc-files-from-git-repo

## Question from csstudentbruh#5797

>
> How do I make a makefile in order to compile a .cc file that uses libraries from a git repo

Okay so there are basically 2 questions there

1. what are the commands to run to compile the git repo and include it in your code
2. how to use a makefile to do it

In general, using a makefile is just copy pasting the commands you would have run by hand, so let's focus on question 1.
https://github.com/hzeller/rpi-rgb-led-matrix Looking at the library, it looks like they have a make file which can build it. So in broad strokes - build that library with something like `make all`, then build your code which references the header files in the library then link all the code together. c++ is a nightmare, I'm truly sorry. It has the worst "build story" of any modern language. <img src="./8-17-21-img.jpeg"></img> Tue, 17 Aug 0021 05:00:00 +0000Type Classes in Elmhttps://mccue.dev/pages/8-10-21-typeclasses-in-elm ## Conversation with somebody#0002 > > elm doesn't even have typeclasses apparently Yeah, but you can do the pattern the same way scala does, just manually > so how does show work in elm ```elm type alias ShowOps a = { show : a -> String } show : ShowOps a -> a -> String show ops a = ops.show a stringShowOps : ShowOps String stringShowOps = { show = identity } intShowOps : ShowOps Int intShowOps = { show = String.fromInt } listShowOps : ShowOps a -> ShowOps (List a) listShowOps elementShowOps = { show = \l -> "[ " ++ String.join ", " (List.map (show elementShowOps) l) ++ " ]" } x : String x = show (listShowOps intShowOps) [ 1, 2, 3 ] ``` Like this. It's literally the same as scala except there are no implicit parameters - at which point you start to realize that maybe its not that special a pattern. > well > > obviously. > > it's the implicit parameters that is the special part sure, but the point is if i want to take a parameter that is `Show`, I just need access to the functions > well... yes that's kinda how vtables work too? > Yep, it's all connected. You want "dynamic dispatch" in a strongly typed system with no interface polymorphism? Make a vtable, pass it around. 
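For readers more at home in Java, the same "make a vtable, pass it around" idea can be sketched roughly like this. It mirrors the Elm code above rather than any library API; the names `ShowOps`, `ShowExample`, and `listShowOps` are just carried over for illustration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// A hand-rolled "vtable": a record holding the functions for one type.
record ShowOps<A>(Function<A, String> show) {}

final class ShowExample {
    private ShowExample() {}

    static final ShowOps<String> STRING_SHOW_OPS = new ShowOps<String>(s -> s);
    static final ShowOps<Integer> INT_SHOW_OPS = new ShowOps<Integer>(i -> Integer.toString(i));

    // Builds the ops for List<A> out of the ops for A.
    static <A> ShowOps<List<A>> listShowOps(ShowOps<A> elementShowOps) {
        return new ShowOps<List<A>>(list -> list.stream()
                .map(elementShowOps.show())
                .collect(Collectors.joining(", ", "[ ", " ]")));
    }

    // The caller picks the implementation by passing the ops explicitly.
    static <A> String show(ShowOps<A> ops, A value) {
        return ops.show().apply(value);
    }

    public static void main(String[] args) {
        // prints "[ 1, 2, 3 ]"
        System.out.println(show(listShowOps(INT_SHOW_OPS), List.of(1, 2, 3)));
    }
}
```

The only thing missing compared to Scala is that nothing finds `INT_SHOW_OPS` for you; every call site names the ops it wants.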
Tue, 10 Aug 0021 05:00:00 +0000

Celsius to Fahrenheit in Java
https://mccue.dev/pages/8-8-21-c-to-f-in-java

## Question from Sahil#8151

> i want to convert celsius to farenhiet and vice versa i have got the conversion part down but i want to be able to do with only the letter so that if someone types F at the end it converst to C and if someone types C at the end of any number it converst to farenheit without asking the user to what they want it converted to

```java
public final class Temperature {
    // The single internal representation: degrees Celsius
    private final double c;

    private Temperature(double c) {
        this.c = c;
    }

    public static Temperature fromCelcius(double c) {
        return new Temperature(c);
    }

    public static Temperature fromFarenheight(double f) {
        // convert once, at the boundary, then store Celsius
        return new Temperature((f - 32) * (5.0 / 9.0));
    }

    public double celcius() {
        return this.c;
    }

    public double farenheight() {
        return (this.c * (9.0 / 5.0)) + 32;
    }
}
```

Obviously you know how to do the math, but one technique is to always store a single internal representation (here, Celsius) and do conversions only at the boundaries, when a value comes in or goes out.

Sun, 08 Aug 0021 05:00:00 +0000

How to remove items using a stream
https://mccue.dev/pages/8-4-21-remove-items-with-stream

## Question from Kevin#1322

> When you iterate over a map.keySet(), map.values() or map.entrySet() and you call the iterator.remove() method. How does it work that the key-value of the map is removed?

It depends on the exact collection. Sometimes remove won't be implemented, so for most purposes a stream and a filter would be "cleaner" and less prone to running into potentially unsupported methods.

> i see
>
> how do you remove items while using a stream?

Like this

```java
List<Integer> xs = List.of(1, 2, 3, 4);
List<Integer> evens = xs.stream()
        .filter(x -> x % 2 == 0)
        .collect(Collectors.toList());
```

And for a map you would use `Collectors.toMap` if you need a map at the end.
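For the map case, a minimal sketch might look like the following. The class and method names (`FilterMapExample`, `onlyEvenValues`) are made up for the example; the stream and collector calls are the standard ones.

```java
import java.util.Map;
import java.util.stream.Collectors;

final class FilterMapExample {
    private FilterMapExample() {}

    // Returns a new map holding only the entries whose value is even.
    // The source map is left untouched.
    static Map<String, Integer> onlyEvenValues(Map<String, Integer> source) {
        return source.entrySet().stream()
                .filter(entry -> entry.getValue() % 2 == 0)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("a", 1, "b", 2, "c", 3, "d", 4);
        System.out.println(onlyEvenValues(counts));
    }
}
```

This works even when the source is an unmodifiable map like the one from `Map.of`, because nothing is ever removed from it.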
So "filter" is the answer

> I see instead of removing you just create a new collection

Yeah, because a bunch of iterators don't support remove (`ArrayList`'s does, but not the one you get from `List.of(..)`). If you are programming to the `List` or `Map` interface instead of a particular implementation, it's safer to do a filter.

Wed, 04 Aug 0021 05:00:00 +0000

How to wait for the value from onSuccess
https://mccue.dev/pages/8-4-21-waiting-for-future-to-finish

## Question from Stefan.lnd#2296

> How do I make sure that java waits for the value from the onSucces method: https://pastebin.com/TR6ea2QZ ? (I cut out the irrelevant stuff)
>
> ```java
> public float getMoney(Player player){
>     Float[] money = new Float[1];
>
>     try {
>         //SOME CODE FOR DB
>         asyncMySQL.getMoneyAsnyc(statement, new Callback<Float>() {
>
>             @Override
>             public Float onSucces(Float money) {
>                 //This value should be returned!!!
>                 money[0] = money;
>             }
>
>             @Override
>             public void onFailure(Throwable cause) {
>
>             }
>         });
>
>     } catch (SQLException e) {
>         e.printStackTrace();
>     }
>     return money[0];
> }
> ```

I want to call out that, long term, it's going to be a better tech investment to use blocking SQL because of Project Loom, but yes, you can return a `Future<Float>`

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

public class FutureExample {
    interface ExampleAsync {
        void onSuccess(float money);
        void onFailure(Throwable throwable);
    }

    // Stand-in for the callback-based API from the question
    private void someThing(ExampleAsync exampleAsync) {}

    public Future<Float> getMoney() {
        final var future = new CompletableFuture<Float>();
        try {
            someThing(new ExampleAsync() {
                @Override
                public void onSuccess(float money) {
                    // hands the value to whoever is waiting on the future
                    future.complete(money);
                }

                @Override
                public void onFailure(Throwable throwable) {
                    future.completeExceptionally(throwable);
                }
            });
        } catch (Exception e) {
            future.completeExceptionally(e);
        }
        return future;
    }
}
```

Wed, 04 Aug 0021 05:00:00 +0000

How do I send a message to everyone
https://mccue.dev/pages/8-3-21-sending-a-message-to-everyone

## Question from GummyJon#4984:

> Hey!
I'm currently trying to make a custom communication client between me and my friends, and I'm coming across an issue. I have a server running which creates a different thread per person, but I want it to run a block of code along all threads, so everybody gets the same message. How can this be done?
>
> For context, this is the code itself per thread
>
>```java
>import java.net.*;
>import java.io.*;
>
>public class Handler extends Thread{
>
>    private Socket socket = null;
>
>    public Handler(Socket socket){
>        super("Handler");
>        this.socket = socket;
>    }
>
>    public void run() {
>        try (
>            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
>            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))
>        ) {
>            String inputLine;
>            while ((inputLine = in.readLine()) != null) {
>                System.out.println(inputLine);
>                out.println(inputLine);
>            }
>        } catch (
>            IOException e) {
>            System.out.println("Exception caught");
>        }
>    }
>}
>```

Give each thread a method of communication. In practice this can mean that each thread has its own queue that it reads from, and the "producer" writes the same message to the queue of each "consumer".
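The producer side of that idea could be sketched like this. `Broadcaster`, `register`, and `broadcast` are hypothetical names, not something from the question's code, and the sketch assumes plain `String` messages on an unbounded `LinkedBlockingQueue` to keep it short.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: keeps one queue per connected client.
final class Broadcaster {
    // CopyOnWriteArrayList so clients can register while a broadcast is iterating
    private final List<BlockingQueue<String>> queues = new CopyOnWriteArrayList<>();

    // Called once per client; that client's thread reads from the returned queue.
    BlockingQueue<String> register() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queues.add(queue);
        return queue;
    }

    // The "producer" side: the same message goes onto every consumer's queue.
    void broadcast(String message) {
        for (BlockingQueue<String> queue : queues) {
            queue.offer(message);
        }
    }
}
```

Each client thread would then block on `take()` on its own queue and write whatever comes out to its socket. A real server would also need to remove a client's queue when that client disconnects.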
[Here](https://www.baeldung.com/java-concurrent-queues) is a decent rundown of your options, but `ArrayBlockingQueue` is probably good enough, so:

```java
import java.net.*;
import java.io.*;
import java.util.concurrent.ArrayBlockingQueue;

record Message(String contents) {}

public class Handler extends Thread {
    private Socket socket = null;
    private final ArrayBlockingQueue<Message> queue;

    public Handler(ArrayBlockingQueue<Message> queue, Socket socket) {
        super("Handler");
        this.queue = queue;
        this.socket = socket;
    }

    public void run() {
        while (true) {
            Message msg;
            try {
                // blocks until another thread puts a message on this queue
                msg = this.queue.take();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            // send msg.contents() to this handler's socket here
        }
    }
}
```

And then push things into the queue from another thread. Though please don't extend `Thread`, just implement `Runnable`.

```java
import java.net.*;
import java.io.*;
import java.util.concurrent.ArrayBlockingQueue;

record Message(String contents) {}

public class Handler implements Runnable {
    private Socket socket = null;
    private final ArrayBlockingQueue<Message> queue;

    public Handler(ArrayBlockingQueue<Message> queue, Socket socket) {
        // no super("Handler") call: a Runnable has no thread name to set
        this.queue = queue;
        this.socket = socket;
    }

    @Override
    public void run() {
        while (true) {
            Message msg;
            try {
                msg = this.queue.take();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            // send msg.contents() to this handler's socket here
        }
    }
}
```

Tue, 03 Aug 0021 05:00:00 +0000

What does this regex mean...??
https://mccue.dev/pages/8-3-21-what-does-this-regex-group-mean

## Question from sobiter#2949:

> What does this regex mean...??
>
> ```
> (?!.*\1)
> ```
>
> Anything that isn't group 1??
It means you should go outside hug a stranger smell a flower Tue, 03 Aug 0021 05:00:00 +0000How to do no more than 3 deletes a monthhttps://mccue.dev/pages/8-2-21-no-more-than-2-deletes-a-month ## Question from ab_al#8947: > I have a method that will check number of times user delete their tasks. Requirement includes no more than 3 deletes in a month. Can someone recommend a library to achieve this task? You need to record this information somewhere. The "simple" solution is to record the deletions in a database and block future ones if they are up to 3 in the last month. So the db can store the record of deletes and the times and then it's up to you to determine what a "month" is. Mon, 02 Aug 0021 05:00:00 +0000How to read a properties filehttps://mccue.dev/pages/8-2-21-extract-properties ## Question from Kiryo#9472 > i need to make that my app shows the specific value in xml file should i use scanner? or there is another way > i need to extract > ``` > # Rate Exp Sp > RateXp = 4. > RateSp = 4. > ``` > the value of for example this^ > > actually not the xml but .properities Get your .properties file as an input stream, then ```java final var properties = new Properties(); try (InputStream is = ...) 
{
    properties.load(is);
}
```

Easy peasy, then just

```java
properties.getProperty("RateXp");
```

Full example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

public record Config(int rateXp, int rateSp) {
    public static Config loadFromClasspath() {
        final var properties = new Properties();
        // getClass() is not available in a static method, so use the class literal
        try (final var inputStream = Config.class.getClassLoader().getResourceAsStream("config.properties")) {
            properties.load(inputStream);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return new Config(
            Integer.parseInt(properties.getProperty("RateXp")),
            Integer.parseInt(properties.getProperty("RateSp"))
        );
    }
}
```

Mon, 02 Aug 0021 05:00:00 +0000

How to have a HashMap with different types as values
https://mccue.dev/pages/8-1-21-hashmap-with-different-value-types

## Question from FellowTomato#4643:

> ```java
> public static HashMap<String, Integer> seats = new HashMap<>(){{
>     put("total_rows", 9);
>     put("total_columns", 9);
>     put("available_seats", new HashMap<String, Integer>(){{
>         put("row", 1);
>         put("column", 1); }});
> }};
>```
>
>How should I initialize HashMap to allow several datatypes as values?
>
>I want something like this in result:
>
>```json
>{
>    "total_rows":5,
>    "total_columns":6,
>    "available_seats":[
>        {
>            "row":1,
>            "column":1
>        },
>...];
>```
>
> Turns out, I can simply type `Object`...

You usually want to make objects for parsing JSON, not a `Map<Object, Object>`.

> Sorry, I'm really new to Java. Can you provide an example?

Okay so for what you have there

```json
{
    "row":1,
    "column":1
},
```

when we put this into Java we want to map it to a class like so

```java
public record SeatPosition(int row, int column) {}
```

and then for the whole structure

```java
public record Theater(int totalRows, int totalColumns, List<SeatPosition> availableSeats) {}
```

> Ah, you mean make a special class for a second HashMap?

Only use a hash map if every key maps to the same kind of thing and you don't know the set of keys ahead of time.
If you know the set of things, like `"total_rows"` and `"available_seats"`, that is your clue that you should be representing things as classes. Records are a good default for this kind of thing. They are just a shorthand way of making a class that contains data and has methods for accessing it.

Sun, 01 Aug 0021 05:00:00 +0000

How do I extract the time from a Java Date object
https://mccue.dev/pages/8-1-21-extract-time-from-java-date-object

## Question from Meeks#7478:

> how can i extract the time (i.e HH:mm:ss) from a java date object?

Take your date

```java
Date d = new Date();
```

turn it into an `Instant`

```java
Date d = new Date();
Instant instant = d.toInstant();
```

make a date time formatter

```java
Date d = new Date();
Instant instant = d.toInstant();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss");
```

interpret your instant in a time zone

```java
Date d = new Date();
Instant instant = d.toInstant();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss");
ZonedDateTime time = instant.atZone(ZoneId.of("EST"));
```

then format that with the formatter

```java
Date d = new Date();
Instant instant = d.toInstant();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss");
ZonedDateTime time = instant.atZone(ZoneId.of("EST"));
String dateString = formatter.format(time);
```

and then just don't have a `Date` anymore

```java
Instant instant = Instant.now();
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm:ss");
ZonedDateTime time = instant.atZone(ZoneId.of("EST"));
String dateString = formatter.format(time);
```

> will the time still be considered a date object
>
> cause my sql column is a date time type

Is it a `java.sql.Date` or a `java.sql.Timestamp`?

https://docs.oracle.com/javase/8/docs/api/java/sql/Timestamp.html#toInstant--

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html#toInstant--

`java.sql.Timestamp` and `java.util.Date` both have a `toInstant` method, which is how you can enter the realm of sane date types. (Careful with `java.sql.Date`: it overrides `toInstant` to throw `UnsupportedOperationException`, so go through its `toLocalDate()` instead.)
Both also have a `from(Instant)` (`java.sql.Date` extends `java.util.Date`). > so when im saving to db an instant will be able to be inserted in? Depends on your driver, but most likely when you insert you will need to convert `Date.from(instant)`. > i see > > lemme try > > thanks man Sun, 01 Aug 0021 05:00:00 +0000How do I stop a threadhttps://mccue.dev/pages/8-1-21-how-to-stop-a-thread ## Question from Arbee#3030: > How do i stop a thread? <img src="/pages/8-1-21-you-dont.png"></img> > And how do I stop my bot then? lol You can interrupt the thread, or cancel the task and then it is up to your code to check the Thread.isInterrupted flag. Some built in methods which throw interrupted exceptions will do that check for you (like those on `HttpClient`) but without shutting down you cannot force a thread to stop. > Im kinda lost rn... how do I stop my bot without stopping the thread it is running in? well - you can communicate to the bot's thread > And how do I do that? Sharing a reference to an `AtomicBoolean` would do the trick. So if your bot has an `AtomicBoolean` it can check it occasionally to see if the value is true. If it's true, it can clean itself up - and atomic boolean is safe to change from another thread. But it might be difficult depending on how your bot framework is built Sun, 01 Aug 0021 05:00:00 +0000Are Functions called Methods in Javahttps://mccue.dev/pages/7-31-21-are-java-methods-functions ## Question from Em.#0694 > Hi > > Are functions called methods in Java? Yes, but the caveat is that you don't have true "free standing" functions. so in python ```python def f(x, y): return x + y ``` This is a function, it is its own thing. In Java "methods" have to "belong" to something either instances of a class or to the class itself. > Oh > > yeah > > that makes sense ```java class Thing { static int f(int x, int y) { return x + y; } } ``` so this `f` function in python, translated to java has to "belong" to some class as a "static method". 
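To make the "instances of a class or the class itself" distinction concrete, here is a small sketch. `Adder` and `MethodsDemo` are made-up names for the example.

```java
final class Adder {
    private final int base;

    Adder(int base) {
        this.base = base;
    }

    // Instance method: belongs to one particular Adder and can read its fields.
    int addTo(int x) {
        return base + x;
    }

    // Static method: belongs to the class itself, the closest thing to a free function.
    static int f(int x, int y) {
        return x + y;
    }
}

final class MethodsDemo {
    public static void main(String[] args) {
        System.out.println(Adder.f(1, 2));           // called on the class, prints 3
        System.out.println(new Adder(10).addTo(5));  // called on an instance, prints 15
    }
}
```

Note that calling the static method needs the class name (`Adder.f`) from outside the class, whereas the Python `f` can be called bare.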
Sat, 31 Jul 0021 05:00:00 +0000

What would define FP in Java?
https://mccue.dev/pages/7-28-21-what-would-define-fp-in-java

## Question from Fast Q#2816

> What would define FP in java? I usually hear it thrown around when talking about streams/ functional interfaces

Let me write out some examples

```java
public final class BankAccount {
    private int balance;

    public BankAccount() {
        this.balance = 0;
    }

    public void deposit(int amt) {
        if (amt < 0) {
            throw new IllegalArgumentException("amt must be non-negative");
        } else {
            this.balance += amt;
        }
    }

    public int withdraw(int amt) {
        if (amt < 0) {
            throw new IllegalArgumentException("amt must be non-negative");
        } else {
            if (this.balance < amt) {
                int balance = this.balance;
                this.balance = 0;
                return balance;
            } else {
                this.balance -= amt;
                return amt;
            }
        }
    }

    public int getBalance() {
        return this.balance;
    }
}
```

Okay real basic class here i just whipped up. Take a moment to read it and show me what an example usage might be. (this isn't FP, this is normal) (i'm also sure there is a bug, but ignore it)

> So an example like:
>
> ```java
> BankAccount myTaxHaven = new BankAccount();
> myTaxHaven.deposit(500);
> myTaxHaven.withdraw(250);
> ```
> ?

sure. Mutable bank account - you can take money in and out.

Now here is the challenge. We have a new requirement. We want to know the state of every bank account at every time. So somewhere in the code there is a map of user id to a map of timestamp to bank account. (does that parse?)

> transaction timestamp?

yeah

> yep I'm tracking

But in order to support that use case we can't change bank accounts directly like that. Or rather, one way to support it is to make the bank account immutable.

> Wait, why can't we change bank accounts directly?
let's say we had this

```java
Map<Instant, BankAccount> bankAccountAtTime = new HashMap<>();
BankAccount myTaxHaven = new BankAccount();
bankAccountAtTime.put(Instant.now(), myTaxHaven);
myTaxHaven.deposit(400); // oh no, our history is messed up
```

> Why is our history messed up?
>
> so the fact that we mutate our bank account after depositing? That makes me question the original claim that this is a solution
>
> I mean yeah we're mapping to the bank accounts, not a specific balance of the bank account in that time

My framing here is a bit messed up. I guess I'm saying if they were immutable then this kind of solution would work.

> If they were immutable, deposit wouldn't even do the same thing though

That's correct. So how would we re-write the code such that we support all the same use cases but our contract doesn't have any mutating methods?

> Keep a list of <Instant, Balance> or <Instant, Transaction>(which contains before/after information)
> Or a map, but some data structure

I mean just the `BankAccount` class.

> deposit/withdraw could return a Balance or Transaction object, but then I think our paradigm doesn't even make sense

```java
public final class BankAccount {
    private final int balance;

    public BankAccount() {
        this(0);
    }

    private BankAccount(int balance) {
        this.balance = balance;
    }

    // Returns a new account; the one you called this on is unchanged
    public BankAccount deposit(int amt) {
        if (amt < 0) {
            throw new IllegalArgumentException("amt must be non-negative");
        } else {
            return new BankAccount(this.balance + amt);
        }
    }

    record WithdrawalResult(BankAccount account, int withdrawn) {}

    // Returns both the new account and how much was actually withdrawn
    public WithdrawalResult withdraw(int amt) {
        if (amt < 0) {
            throw new IllegalArgumentException("amt must be non-negative");
        } else {
            if (this.balance < amt) {
                return new WithdrawalResult(new BankAccount(0), this.balance);
            } else {
                return new WithdrawalResult(new BankAccount(this.balance - amt), amt);
            }
        }
    }

    public int getBalance() {
        return this.balance;
    }
}
```

A bank account is a full history - always.
There is the current state of it and past states of it, but they are all equally real "accounts". > right, that's a fine way of viewing it yeah It's like numbers. Imagine if this worked ```java Integer i = 5; i.subtractOne(); // i is now 4 ``` the *value* of 5 should be independent from the *identity* you assign that value. If that isn't the case - for numbers - it feels super wierd > Is the main point that mutability is inherently dangerous? Kinda. there are a lot of downsides to it - it's a lot harder to multithread is a big one. And you can always *get it back* if you need to. > Yeah this makes sense > I definitely believe in keeping an immutable history for anything important Wed, 28 Jul 0021 05:00:00 +0000How does this cause deadlockhttps://mccue.dev/pages/7-19-21-how-does-this-cause-deadlock ## Question from blindspot23#4418 > ```java > public class Deadlock { > static class Friend { > private final String name; > public Friend(String name) { > this.name = name; > } > public String getName() { > return this.name; > } > public synchronized void bow(Friend bower) { > System.out.format("%s: %s" > + " has bowed to me!%n", > this.name, bower.getName()); > bower.bowBack(this); > } > public synchronized void bowBack(Friend bower) { > System.out.format("%s: %s" > + " has bowed back to me!%n", > this.name, bower.getName()); > } > } > > public static void main(String[] args) { > final Friend alphonse = > new Friend("Alphonse"); > final Friend gaston = > new Friend("Gaston"); > new Thread(new Runnable() { > public void run() { alphonse.bow(gaston); } > }).start(); > new Thread(new Runnable() { > public void run() { gaston.bow(alphonse); } > }).start(); > } > } > ``` > > This is the example given for deadlock in java docs. I presume that inside the bow method, the last line causes the deadlock. But how? We access the bowBack method using the object and not the thread. Then why do we get a deadlock. 
Those synchronized blocks basically make this ``` Thread 1: - lock gaston - lock alphonse - unlock alphonse - unlock gaston Thread 2: - lock alphonse - lock gaston - unlock gaston - unlock alphonse ``` now we can write those out ``` 1. lock gaston 1. lock alphonse 1. unlock alphonse 1. unlock gaston 2. lock alphonse 2. lock gaston 2. unlock gaston 2. unlock alphonse ``` and intermesh them like so actually - exercise for you, how can those get ordered in a way that causes deadlock > Because bowBack method invocation inside bow method tries to get the lock of gaston, while a thread that tries to get the lock on gaston is already blocked and queued? ``` 1. lock gaston 2. lock alphonse 1. lock alphonse 2. lock gaston 1. unlock alphonse 1. unlock gaston 2. unlock gaston 2. unlock alphonse ``` Basically this. I think your words are right. > Alright then, Thank you for the help. Mon, 19 Jul 0021 05:00:00 +0000Array basicshttps://mccue.dev/pages/7-16-21-array-basics ## Question from junk#1089 > the question ask for: > > user input a number, n then save into array. after that, display the numbers out according to sequence An array is a fixed size container of elements ```java int[] numbers = new int[10]; ``` so here I made an array of 10 integers. They will all start at 0, and I can set ```java numbers[5] = 123; ``` any number in the array and get ```java int i = numbers[2]; ``` any number in the array. But if I try to get a number outside of the bounds of the array (in this case 0-9) ```java int j = numbers[10]; ``` it will crash. so if a user inputs a number and you want to save that number into an array you need to 1. Know what "place" in the array you are at 2. 
Know whether you can expect a certain number of elements because an array, like I said, is fixed size > thank you so much Fri, 16 Jul 0021 05:00:00 +0000Why is my List code crashinghttps://mccue.dev/pages/7-10-21-caveats-with-arrays-as-list ## Question from Los Feliz#2763 > is there a way to convert List&lt;String&gt; to ArrayList&lt;Object&gt;, and vice versa? > > here's the full picture: > > I have a String arr[]. > > I convert it into List&lt;String&gt; so I can show its content on Android app. > > But then for performing filter for search purposes, I require an ArrayList of objects. > > So I wanna convert my List&lt;String&gt; into an ArrayList&lt;Object&gt;, because my adapter is set to accept ArrayList as its params. > > this is how i converted. > > ```java > String arr[]; > List&lt;String&gt; list1; > > list1 = Arrays.asList(getResources().getStringArray(R.array.string-array's name)); > ``` > > thought i know what i'm doing until i saw the app crash with what i have now. > ``` > Unknown bits set in runtime_flags: 0x8000 > E/libc: Access denied finding property &quot;ro.serialno&quot; > E/AndroidRuntime: FATAL EXCEPTION: main > > at java.util.AbstractList.remove(AbstractList.java:161) > at java.util.AbstractList$Itr.remove(AbstractList.java:374) > at java.util.AbstractList.removeRange(AbstractList.java:571) > at java.util.AbstractList.clear(AbstractList.java:234) > at com.example.cspeaks.MyAdapter$2.publishResults(MyAdapter.java:105) > at android.widget.Filter$ResultsHandler.handleMessage(Filter.java:284) > at android.os.Handler.dispatchMessage(Handler.java:107) > at android.os.Looper.loop(Looper.java:214) > at android.app.ActivityThread.main(ActivityThread.java:7356) > at java.lang.reflect.Method.invoke(Native Method) > at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492) > at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:930) > ```. ohhh. 
wrap your `Arrays.asList(...)` with `new ArrayList<>(Arrays.asList(...))` that *might* help. IDK though that message is horrific. > OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG @emccue > YOU ARE THE MAN! > JESUS CHRIST! > THANK YOUUUUUUUUUUUUUUUUUUUUU Okay don't leave yet. I want you to understand why that helped. > <img src="/pages/7-10-21-all-ears.gif"></img> List is an interface that has methods that read from a list (like `.get`) and methods that alter a list (like `.remove`). An implementation of that interface like `ArrayList` supports all the methods. You can get elements, add elements, and remove elements. But there are a bunch of methods in the JDK that return lists that are unmodifiable. `Arrays.asList` is one of those, because the purpose of it is to be a list "view" onto an array. <img src="/pages/7-10-21-javadoc.png"></img> > ooooooo. So in other words, when we have Arryas.asList, if we want to make changes to the array, we must change it to arraylist That doesn't parse for me, Arrays are their own things. So if you have a `String[]`, that is a fixed size collection. It does not implement the List interface. `Arrays.asList(..)` is a way you can take a`String[]` and get a `List<String>`, but the `List` you get doesn't support adding or removing elements. If you want to add or remove elements from that list you need to copy its contents into a list that supports that. > OMG. i learned this hard way. Almost 20 hours. `ArrayList` is the most common one and a good default. `List.of(...)` and `.toList()` on a `Stream` also give you unmodifiable lists - though those ones don't support setting elements in place either. Its annoying, but you just need to be aware of it. If you are in doubt, copy it into an `ArrayList` and you know you can do everything. > yeah. lesson learned. from now on Im coppying every instance of List<> into an arraylist Thats one approach. 
In other contexts that can be called "defensive copying" - Its a generally good technique if you are working with mutable structures though overkill if you are the one making and working with all the instances (potentially). Sat, 10 Jul 0021 05:00:00 +0000Why do we throw exceptions?https://mccue.dev/pages/7-9-21-why-do-we-throw-an-exception ## Question from blindspot23#4418 > Why do we throw an exception when try and catch can handle it? The use case is for "exceptional conditions". Like if you write a library that asks google for search results that can always fail because it goes over the network and you use try catch to handle it, but sometimes either 1. you can't handle it and you need to crash 2. you don't want to handle it and prefer to crash 3. It makes sense to "bubble" the exception up multiple layers Making it more complicated - there are actually 2 "kinds" of exceptions in Java, checked and unchecked ones. Checked exceptions like `IOException` you always have to handle either by doing some behavior or rethrowing. Unchecked exceptions like `RuntimeException` the language doesn't make you handle so generally those are used for situations where you don't think people will want to try to recover. It's all also really abstract until you try writing some code that uses or works with them. (and also really frustrating because not even the standard library does exceptions "right" always) Fri, 09 Jul 0021 05:00:00 +0000How do I break correctlyhttps://mccue.dev/pages/7-7-21-how-do-i-break-correctly ## Question from Quadzilla#9639 > ```java > public class Pyramid { > > public static void main(String[] args) { > > int s = 50; > while (true) { > s += 50; > if (s > 50) { > System.out.println(s); > } else { > s = 200; > break; > } > while (s >= 200) { > break; > } > } > } > } > ``` > > How can I add in a break to prevent the numbers from going to 20,000 in the terminal and beyond. Am I applying break correctly? 
I adapted the code from a java forum that printed a pyramid & thought today is the day to apply the break statement. s was changed by me to integer rather than string. So first ```java int s = 50; while (true) { s += 50; if (s > 50) { System.out.println(s); } else { s = 200; break; } while (s >= 200) { break; } } ``` This else block here where you set s to 200 will never run. `s` starts out at 50, you immediately add 50 to it, So it's always above 50 and you never set it to anything lower. ```java int s = 50; while (true) { s += 50; if (s > 50) { System.out.println(s); } while (s >= 200) { break; } } ``` Now for that last while block with the break. `break` is a "terminal" action. You will exit your loop if you do it. And because you nest that other loop you only break out of the `while (s >= 200)` loop, not the `while (true)` loop. So what you want isn't a `while`, it's an `if`. ```java int s = 50; while (true) { s += 50; if (s > 50) { System.out.println(s); } if (s >= 200) { break; } } ``` Wed, 07 Jul 0021 05:00:00 +0000How do I inject dependencies into Enumshttps://mccue.dev/pages/7-6-21-javax-inject-into-enums ## Question from Chem#9771 > In `javax`, if you use `@Inject` into a constructor, but you have instances where you can't pass those parameters, what do you do? > >```java >@ApplicationScoped >public class JavascriptLanguageExecutor extends LanguageExecutor { > @Inject > JavascriptLanguageExecutor(EngineInstance engine) { > super(Language.JAVASCRIPT, engine.instance); > } >} > >public enum Language { > JAVASCRIPT("js"); > > public LanguageExecutor getExecutor() { > return switch (this) { > // THIS NEEDS "EngineInstance" BUT I DON'T HAVE ABILITY TO "@Inject" INTO ENUMS > case JAVASCRIPT -> new JavascriptLanguageExecutor(); > }; > } >``` > >So here this `EngineInstance` is available via injection in the app but you can't use this inside of enums I think, and so I can't actually instantiate the class there. 
> > Could I move it out of the constructor and do a field injection In that case, you need to pass an EngineInstance. Sounds dumb, but yeah. Ignore all the javax requirements for a second, since they aren't super relevant. ```java public class JavascriptLanguageExecutor extends LanguageExecutor { JavascriptLanguageExecutor(EngineInstance engine) { super(Language.JAVASCRIPT, engine.instance); } } ``` What you want is, effectively, a map of Language -> Executor. So just make one. ```java var executors = new EnumMap<Language, LanguageExecutor>(...); ``` And then pass it around where you need it. There are more options but generally speaking Java is structured in a way where passing the thing is the path of least resistance. Unless you made your LanguageExecutors work with the service provider stuff, but that's somewhat niche (for application code). I'm not gonna pretend you can't also have a static map and fill it up with instances - you can. Just then you need to be careful of when it is hydrated vs. uninitialized and make sure to synchronize if you need to etc. Tue, 06 Jul 0021 05:00:00 +0000How to execute code from a client side apphttps://mccue.dev/pages/7-4-21-how-to-run-code-from-the-client > Is it possible to write complex code (like instantiating an object and use the stream API) in string then parse it using something like spring language expression? Basically what I'm trying to do is to write code from the client side in string format then find a way to parse it and execute it on the backend side. From a business use case what I want to do is write some code to out put something and simply store it in a column field named "result". > > Someone replied to me before and said it's doable but I had to abandon the chat > If you abandon Java you can do this with something you can sandbox in the jvm, like possibly nashorn or my pref. - SCI. ```java String expressionStr = """ (let [customer-1 (Customer. "Mike" 9) customer-2 (Customer. "Ted" 20) customers
[customer-1 customer-2]] (->> customers (filter (fn [customer] (= "Mike" (.getName customer)))) (first))) """; ``` Then evaluate that in SCI. https://github.com/borkdude/sci ```java IFn eval = Clojure.var("sci.core", "eval-string"); Customer customer = (Customer) eval.invoke(expressionStr); ``` and to give it sandboxed access to your classes, probably do this ```java IFn symbol = Clojure.var("clojure.core", "symbol"); IFn keyword = Clojure.var("clojure.core", "keyword"); IFn eval = Clojure.var("sci.core", "eval-string"); Customer customer = (Customer) eval.invoke( expressionStr, Map.of(keyword.invoke("classes"), Map.of(symbol.invoke("Customer"), Customer.class)) ); ``` Full example if you have clojure and sci included in your project ```java import clojure.java.api.Clojure; import clojure.lang.IFn; import java.util.Map; public final class SciTest { private SciTest() {} private static final IFn REQUIRE; private static final IFn SYMBOL; private static final IFn KEYWORD; private static final IFn EVAL_STRING; static { REQUIRE = Clojure.var("clojure.core", "require"); SYMBOL = Clojure.var("clojure.core", "symbol"); KEYWORD = Clojure.var("clojure.core", "keyword"); REQUIRE.invoke(Clojure.read("sci.core")); EVAL_STRING = Clojure.var("sci.core", "eval-string"); } public record Thing(int x) {} public static void main(String[] args) { System.out.println(EVAL_STRING.invoke( """ (Thing. 10) """, Map.of( KEYWORD.invoke("classes"), Map.of( SYMBOL.invoke("Thing"), Thing.class ) ) )); } } ``` > Will take a look, thanks! Curious what the actual requirement here is though. Since this isn't exactly super duper safe to do even with the sandboxing. They can still OOM your machine or whatever. > Requirement isn't fully clear but basically when creating a new entity record, there's a "result" field where the end user wants to write some code or formula based on other fields of the entity, ex: result = id + title + getUniqueRandNubr() Yeah this might be a better use for a little dsl. 
You can make a tiny lang with antlr pretty easily. This is more general purpose, you can basically write any code. Sun, 04 Jul 0021 05:00:00 +0000How does this stack implementation workhttps://mccue.dev/pages/7-3-21-how-does-this-stack-work ## Question from whoopdoop#7311 > Im looking at this stack implementation and > > ```java > public class ArrayStackOfStrings implements Iterable<String> { > private String[] items; // holds the items > private int n; // number of items in stack > > public ArrayStackOfStrings(int capacity) { > items = new String[capacity]; > } > > public void push(String item) { > items[n++] = item; > } > > public String pop() { > return items[--n]; > } > ``` > > If I push an item using this push method, we are changing the items list, but how are we changing the variable n to update the new number of items? same with pop() n++ does the mutation > wait so n is being changed impliciftly? yes. `n++` means return the current value of n and also increment it. `--n` means decrement it and then return the new value. It's clearer to write it like this ```java public class ArrayStackOfStrings implements Iterable<String> { private String[] items; // holds the items private int n; // number of items in stack public ArrayStackOfStrings(int capacity) { items = new String[capacity]; } public void push(String item) { items[n] = item; n++; } public String pop() { n--; return items[n]; } ``` or ```java public class ArrayStackOfStrings implements Iterable<String> { private String[] items; // holds the items private int n; // number of items in stack public ArrayStackOfStrings(int capacity) { this.items = new String[capacity]; } public void push(String item) { this.items[this.n] = item; this.n++; } public String pop() { this.n--; return this.items[this.n]; } ``` so you see more clearly the order of things without remembering the difference between `n++` and `++n` > ohhh that makes so much sense > > theres no diff between n-- and --n? 
There is ```java int x = 5; int y = x++; // x is 6, y is 5 ``` ```java int x = 5; int y = ++x; // x is 6, y is 6 ``` But just don't get clever with it, do it on its own line Sat, 03 Jul 0021 05:00:00 +0000Do you write incremental operators in loops and conditions?https://mccue.dev/pages/7-3-21-should-i-use-incremental-operators ## Question from keldranase#4427 > The question is more about coding style for production grade code. Do you write incremental operators in loops and conditions? Consider two pieces of merge algo. What version is better? > ```java > if (list.get(left) < list.get(right)) { > result.set(writePtr, list.get(left)); > ++left; // separated incremetns > ++writePtr; > } else { > result.set(writePtr, list.get(right)); > ++right; > ++writePtr; > } > ``` > > ```java > if (list.get(left) < list.get(right)) { > result.set(writePtr++, list.get(left++)); // increments inside things > } else { > result.set(writePtr++, list.get(right++)); > } > ``` > > Another example is something like this: > > ```java > while (someCounter-- > 0) { > // do something > } > ``` > > Or more verbose, like this: > > ```java > while (someCounter > 0) { > // do something > --someCounter; > } > ``` Most increments are effectively just iterators. So if you are doing anything other than `(int i = 0; i < container.size(); i++)` then the best code quality would come from working with the iterator/stream abstractions. However the first one is the one I would go with, with the caveat that I would use `left++` instead of `++left`. No one remembers operator precedence order except C programmers, so separating mutation from assignment/passing of values is best. Everyone does `thing++` usually and, outside of a context where you are passing at the same time you are changing it, it doesn't matter. 
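To make the "keep the increments on their own line" advice concrete, here is a minimal sketch of a whole merge written that way. It assumes two already-sorted lists rather than the single list with `left`/`right` pointers from the question, and the `MergeExample` name is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original question.
final class MergeExample {
    private MergeExample() {}

    // Merges two lists that are each already sorted ascending.
    static List<Integer> merge(List<Integer> left, List<Integer> right) {
        List<Integer> result = new ArrayList<>(left.size() + right.size());
        int l = 0;
        int r = 0;

        while (l < left.size() && r < right.size()) {
            if (left.get(l) <= right.get(r)) {
                result.add(left.get(l));
                l++; // mutation on its own line, nothing else depends on its value here
            } else {
                result.add(right.get(r));
                r++;
            }
        }

        // One side is exhausted; whatever is left on the other side is already in order.
        result.addAll(left.subList(l, left.size()));
        result.addAll(right.subList(r, right.size()));
        return result;
    }
}
```

Since every increment is a standalone statement, `l++` vs `++l` makes no difference anywhere in this code, which is the whole point.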
Sat, 03 Jul 0021 05:00:00 +0000How do I fix my MS-Paint clonehttps://mccue.dev/pages/7-3-21-dissapearing-lines-swing ## Question from CONNOR#1257 > Hello, I am trying to make an application similar to ms paint. > > When I add `super.paint(g)` to my paint method my panel appears, however, when I start drawing the lines keep disappearing. when I remove the `super.paint(g)` the panel is no longer visible but the drawing works perfectly fine. > > <img src="/pages/7-3-21-blank-swing.png"></img> Your issue is this: If you call `super.paint(g)` it will clear the screen. It then expects you to draw every point. If you don't call `super.paint(g)` then the panel isn't redrawn, but whatever paints you have from previous render cycles are left. The solution is to remember what has been drawn so far and just draw it on the screen again every time you re-render. The easiest way to do this might be to have a 2d array of `java.awt.Color`. When you draw update that array, and when you re-render just write every color to the screen. Sat, 03 Jul 0021 05:00:00 +0000What is the difference between arguments and parametershttps://mccue.dev/pages/7-2-21-difference-between-arguments-and-parameters ## Question from Aviral#7054 > Hey guys. Can anyone please take 5 mins of their time and just explain the difference between arguments and parameters? > > Some people say it's the same and I'm not really convinced... It's like magma and lava. Same thing, different name based on context. Close enough that only total nerds will correct you if you use them interchangeably Fri, 02 Jul 0021 05:00:00 +0000Are Hashtables not used anymore in Java?https://mccue.dev/pages/7-2-21-are-hashtables-not-used-in-java ## Question from asianmalaysian vietnamese#1514 > Are hashtables like not used anymore in java? Is its synchronicity for key value pairs not a beneficial thing? 
> > Oh if someone asks where did I hear hashtables not being used anymore, in a different group in a different social media platform They have been superseded. People should now (generally) be writing code that targets the `Map` interface. If you want a synchronized map you can use `Collections.synchronizedMap` and if you want a good concurrent map there is `ConcurrentHashMap`. That being said `Hashtable` isn't broken or anything. You can still use it if you want a synchronized map. It just comes from before the Java collections framework so the "default" we want to show people is `HashMap` and then how to synchronize whatever map they chose. So strictly speaking `Hashtable` is redundant. > got it, thanks! The same is true for `Vector` vs `ArrayList`. Also like, <img src="/pages/7-2-21-jdoc-1.png"></img> `Hashtable` implements `Map` and extends `Dictionary` <img src="/pages/7-2-21-jdoc-2.png"></img> Dictionary is this weird abstract class that is basically an interface and some of its methods like keys() are redundant with keySet() on Map and also return `Enumeration`. <img src="/pages/7-2-21-jdoc-3.png"></img> which is like iterator but without syntax support. So yeah, just generally more crufty. Fri, 02 Jul 0021 05:00:00 +0000Do I need new exceptions for every invalid fieldhttps://mccue.dev/pages/7-1-21-do-i-need-exceptions-for-every-field ## Question from huh#0893 > are exception classes used per class field or can one exception class handle every field? > > If my class has fields for > > ``` > String name, int ID, double salary, double hours > ``` > > If I need exceptions for handling empty name `""`, negative number ID, negative salary, negative hours, do I need a new exception class for every field? 
> > by exception class I mean something like this > > ```java > public class InvalidPayRate extends Exception > { > public InvalidPayRate(double p) > { > super("Hourly pay rate may not be negative or greater than 25: " + p); > } > } > ``` The reason you don't want a specific `InvalidPayRate` exception in this case is that it is an *unrecoverable* scenario. An invalid pay rate will almost always mean programmer error. ```java public class InvalidPayRate extends Exception { public InvalidPayRate(double p) { super("Hourly pay rate may not be negative or greater than 25: " + p); } } ``` If you were to throw this exception from somewhere you would need to declare that you throw it and then the caller would need to handle it. So at the very least you want this to extend `RuntimeException`, since it's a programmer error that you don't expect anyone to catch. ```java public class InvalidPayRate extends RuntimeException { public InvalidPayRate(double p) { super("Hourly pay rate may not be negative or greater than 25: " + p); } } ``` At that point you need to weigh the value of doing this for each field in a class, and the value is just having the stack trace say "InvalidPayRate", which you can already put into the message. Thu, 01 Jul 0021 05:00:00 +0000How do I make Yoté in JavaFXhttps://mccue.dev/pages/6-28-21-how-to-make-yote-in-javafx ## Question from Bruno Machado#9013 > Hi everyone, I want to develop the Yoté game in java fx with scenebuilder and sockets, but I'm having trouble figuring out how to make each client's interface change every moment. How can I do this kind of communication with the server in javafx? thank you all Have a dedicated thread poll the socket for the current state on an interval and communicate that to your UI thread. I don't know the JavaFX specifics but the general concept is to constantly poll on a thread (or have the server push) and then update your model of the game accordingly. 
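A rough sketch of that polling idea using only `java.util.concurrent`. Everything here is an assumption for illustration: the `StatePoller` name, the `fetchState` supplier (which would wrap your socket read) and the `handOff` consumer (which would pass the state to the UI thread) are all made up, not part of JavaFX or of the asker's code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: polls for state on a background thread at a fixed interval.
final class StatePoller<S> implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "state-poller");
                thread.setDaemon(true); // don't keep the JVM alive just for polling
                return thread;
            });

    StatePoller(Supplier<S> fetchState, Consumer<S> handOff, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // Runs on the poller thread, never on the UI thread.
                S state = fetchState.get();
                handOff.accept(state);
            } catch (RuntimeException e) {
                // An exception escaping this task would cancel all future runs,
                // so report it and let the next tick try again.
                System.err.println("poll failed: " + e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In a JavaFX app the hand-off would presumably be something like `state -> Platform.runLater(() -> model.update(state))`, where `model` is whatever object your UI is bound to, so the actual update happens on the UI thread.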
Mon, 28 Jun 0021 05:00:00 +0000Enums are a shorthandhttps://mccue.dev/pages/6-28-21-enums-are-just-a-shorthand ```java public enum StopLight { RED, GREEN, YELLOW; } ``` Enums are a shorthand for this pattern ```java public final class StopLight { public static final StopLight RED = new StopLight(); public static final StopLight GREEN = new StopLight(); public static final StopLight YELLOW = new StopLight(); private StopLight() {} } ``` They get more support by being a language feature just like records but this is it. So when you say make a singleton like this ```java public class PlayerManager { private static final PlayerManager INSTANCE = new PlayerManager(); public static PlayerManager getInstance() { return INSTANCE; } private PlayerManager() {} } ``` all I see is this ```java public enum PlayerManager { INSTANCE; public static PlayerManager getInstance() { return INSTANCE; } } ``` Mon, 28 Jun 0021 05:00:00 +0000Why is my python so uglyhttps://mccue.dev/pages/6-27-21-why-is-this-python-ugly ## Question from Leslie#7406 > ```python > FRR = map(lambda element: element[1], read_csv) > FAR = map(lambda element: element[2], read_csv) > ``` > why is this so ugly Use `operator.itemgetter(1)`, that's what it is for. Also a list comprehension is nicer usually. > make an example out of my code pls ```python FRR = map(operator.itemgetter(1), read_csv) ``` But then also ```python FRR = [ element[1] for element in read_csv ] ``` But then also ```python FRR, FAR = zip(*[ (element[1], element[2]) for element in read_csv ]) ``` > idk Ive never seen a zip(* [ before `*` is a splat. It unrolls an iterable into arguments. Sun, 27 Jun 0021 05:00:00 +0000How to handle passwordshttps://mccue.dev/pages/6-27-21-how-to-handle-passwords ## DO NOT Store passwords in plaintext. You should never have a record of your users' passwords in a form that you can read. 
## DO NOT Encrypt passwords https://nakedsecurity.sophos.com/2013/11/04/anatomy-of-a-password-disaster-adobes-giant-sized-cryptographic-blunder/ ## DO NOT Use a general purpose hash function like SHA or MD5 https://security.stackexchange.com/questions/90064/how-secure-are-sha256-salt-hashes-for-password-storage ## DO NOT "roll your own" implementation of cryptographic functions. You will get it wrong. https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own ## DO Use a well tested implementation of PBKDF2, bcrypt, or scrypt to hash and salt passwords https://spring.io/projects/spring-security https://mvnrepository.com/artifact/org.springframework.security/spring-security-core ```java import org.springframework.security.crypto.password.PasswordEncoder; import org.springframework.security.crypto.factory.PasswordEncoderFactories; public final class PasswordUtils { private PasswordUtils() {} private static final PasswordEncoder PASSWORD_ENCODER = // At time of writing the default implementation uses BCrypt PasswordEncoderFactories.createDelegatingPasswordEncoder(); // Store the result of this in your database public static String encode(String password) { return PASSWORD_ENCODER.encode(password); } // And use this to check if a user gave you the right password later public static boolean matches(String password, String encodedPassword) { return PASSWORD_ENCODER.matches(password, encodedPassword); } } ``` Sun, 27 Jun 0021 05:00:00 +0000How do I get this part of a Stringhttps://mccue.dev/pages/6-23-21-how-to-pull-out-text-from-a-string ## Question from RaiderRoss#6666 > hey > > ik this might sound stupid but i need help with some string manipulation > > ```java > int pos = e.getMessage().getContentRaw().lastIndexOf(" "); > ``` > > So this is my code, my String would look like this > ``` > ~ban @RaiderRoss This is the whole reason including spaces > ----------------------------------------- > ``` > how would i get the underlined part as a string > > 
so like the second index basically Once you have an index, you can trim the string from that point forward using substring. So ```java String messageRaw = e.getMessage().getContentRaw(); String reason = messageRaw.substring(messageRaw.lastIndexOf(" ")); ``` But in your case it seems like you would only get `spaces`. Instead, we can split on spaces. ```java String messageRaw = e.getMessage().getContentRaw(); String[] words = messageRaw.split(" "); ``` then join, skipping the first two > how do I join > > skipping the first two ```java String messageRaw = "a b c "; String[] words = messageRaw.split(" "); String reason = Arrays.stream(words) .skip(2) .collect(Collectors.joining(" ")); ``` > btw our teacher never thought us any of this I got to learn it all on my own so ty for the help This is one way. You can also do it manually or instead just find the index of the 2nd space and take a substring after that. https://stackoverflow.com/questions/19035893/finding-second-occurrence-of-a-substring-in-a-string-in-java/35155037 ```java String messageRaw = "a b c "; String reason = messageRaw.substring(messageRaw.indexOf(" ", messageRaw.indexOf(" ") + 1) + 1); ``` and you can also use regex if you feel brave > ok i got it now ty sm for the help Wed, 23 Jun 0021 05:00:00 +0000Is it possible to use SCSS in a Java projecthttps://mccue.dev/pages/6-22-21-scss-in-a-java-project ## Question from Senhor#3353 > hi > > i have one question: is it possible to use SCSS in a Java project? I am currently generating PDF from HTML pages; i craft these pages with html and pure css, and i was wondering if there's a way to use SCSS instead Sure, but you are going to need to do some plumbing. The standard SCSS compiler ships as a JS package on npm, so there really isn't a frictionless way to include it as part of a Java build system. You will need to install it with npm separately, either globally or in a project in the directory where you need to run your jar. Then you need to invoke scss to generate CSS files. 
You can do this through ProcessBuilder, so it is definitely doable - just not frictionless. Tue, 22 Jun 0021 05:00:00 +0000How to print data with proper spaceshttps://mccue.dev/pages/6-22-21-how-to-print-with-proper-spaces ## Question from ericmp#6201 > hi, im tring to print this but with proper spaces, to make it more visual: > > <img src="/pages/6-22-21-data.png"></img> > > ```java > System.out.printf("%-25s %-25s %-20s %-21s %-21s %-24s %-24s %-25s %-25s %-20s %-20s %-20s %-25s %n", "{ nom: " + nom , "tipus: " + tipusS , "número: " + numeroS , "valor: " + valorS , "pedres: " + pedresS , "tirada: " + tiradaS , "agafada: " + agafadaS , "triomf: " + triomfS , "última: " + ultimaS , "tirada per: " + tiradaPerS , "recollida per: " + recollidaPerS , "tirada primer: " + tiradaPrimerS , " }"); > ``` > > but as u see, is not really good, like, sometimes there is more spaces, and sometimes less, and finally, there isnt > > how could i do it better? > > (i tried to put 20s to each one, but looks worse) So this is actually something that really old programs needed to do a lot. And honestly most bank statements, etc. But it's weirdly not super supported in modern languages. If you don't care about how long it gets we can make a method that prints to a table. ```java public static void outputAsTable(List<Map<String, String>> stuff) { } ``` Start with this - assume we have a list of maps you want to print out. We can scan through each map and find the longest value for each key. ```java public static void outputAsTable(List<Map<String, String>> stuff) { Map<String, Integer> longestValues = new HashMap<>(); for (var map : stuff) { for (var entry : map.entrySet()) { if (longestValues.getOrDefault(entry.getKey(), 0) < entry.getValue().length()) { longestValues.put(entry.getKey(), entry.getValue().length()); } } } ... } ``` Then once you have all those you can do math to figure out how many spaces to add. 
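For reference, here is one way the scan-then-pad idea could be finished off. This is a sketch under a few assumptions of mine: it returns the table as a `String` instead of printing, uses the map keys as a header row, separates columns with ` | `, and the `TablePrinter` / `formatAsTable` names are invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answer.
final class TablePrinter {
    private TablePrinter() {}

    static String formatAsTable(List<Map<String, String>> rows) {
        // Pass 1: find the widest text in each column (the header counts too).
        // LinkedHashMap keeps columns in the order they were first seen.
        Map<String, Integer> widths = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            for (Map.Entry<String, String> entry : row.entrySet()) {
                int needed = Math.max(entry.getKey().length(), entry.getValue().length());
                widths.merge(entry.getKey(), needed, Math::max);
            }
        }

        // Pass 2: pad every cell out to its column's width.
        StringBuilder out = new StringBuilder();
        List<String> header = new ArrayList<>();
        for (Map.Entry<String, Integer> column : widths.entrySet()) {
            header.add(pad(column.getKey(), column.getValue()));
        }
        out.append(String.join(" | ", header)).append('\n');

        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (Map.Entry<String, Integer> column : widths.entrySet()) {
                // Rows that lack a column get an empty cell.
                cells.add(pad(row.getOrDefault(column.getKey(), ""), column.getValue()));
            }
            out.append(String.join(" | ", cells)).append('\n');
        }
        return out.toString();
    }

    // Left-aligns text by adding trailing spaces; width is never smaller than the text.
    private static String pad(String text, int width) {
        return text + " ".repeat(width - text.length());
    }
}
```

The "math" ends up being just `width - text.length()` spaces per cell. If you pass in `HashMap`s the column order is whatever the first map iterates in, so use `LinkedHashMap` for the rows if the order matters to you.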
> I think this will be hard, I've never worked with this maps, and this variables are not stored inside any array, they are just the attributes of a class, so I print them to check how all is working You just need to convert your objects to a list of maps, unfortunately. If you want it all properly spaced you need to print them all at once, not individually, since, if you think about it, how much padding to give any one element can be determined by what you want to print 10 items later. Another option is to print vertically ``` field_1: ... field_abc: ... ww: ... ------------------- field_1: ... field_abc: ... ww: ... ``` Tue, 22 Jun 0021 05:00:00 +0000Why is the else not printinghttps://mccue.dev/pages/6-19-21-why-is-the-else-not-printing ## Question from akira💖🌻#6298 > hello i have a small small problem > ```java > package HelloJaxa; > import java.util.*; > public class Main { > public static void main(String[] args) { > Scanner scan=new Scanner(System.in); > System.out.print("Enter 10 Grades--> "); > int Number=0; > int counter90=0; > int counterless=0; > if (Number>=0 && Number<=100) { > for (int i = 0; i < 10; i++) { > Number = scan.nextInt(); > if (Number >= 90) > counter90++; > if (Number > 60 && Number < 90) > counterless++; > } > System.out.println("Grades that equal 90 or more-->"+counter90); > System.out.println("Grades that are between 60 and 90--> "+ counterless); > } > else > System.out.println("Error!"); > }} > ``` > > why is the else not printing when i input a number thats <0 or >100 First, put the `{}` around the else. Sometimes that can be the whole issue. Try not to omit the `{}` even though it's technically allowed. And you will never reach the else since you check if the number is between 0 and 100 before entering the loop where you ask for input. You never return to that check again so it only runs once and sees 0. > what do i do now ? 
Move your check inside your for loop Sat, 19 Jun 0021 05:00:00 +0000Simple AtomicReference Wrapperhttps://mccue.dev/pages/6-19-21-atom Simple wrapper around an atomic reference to provide a basic functional interface for CAS operations. Useful if you have some immutable state you want to share between and mutate on multiple threads ```java package dev.mccue.async; import java.util.concurrent.atomic.AtomicReference; import java.util.function.Function; /** * Simple wrapper over an AtomicReference to provide an API for doing compare and swap operations. * * Modeled after the atom primitive in clojure. * @param <T> The type of data stored in the atom. This is assumed to be an immutable object. */ public final class Atom<T> { private final AtomicReference<T> ref; private Atom(T data) { this.ref = new AtomicReference<>(data); } /** * Creates an atom wrapping the given data. * @param data The data to be stored in the atom. * @param <T> The type of data stored in the atom. This is assumed to be an immutable object. * @return An atom containing the given data. */ public static <T> Atom<T> of(T data) { return new Atom<>(data); } /** * Swaps the current value in the atom for the value returned by the function. * @param f The function to apply to the current value. It is expected that this * will be a "pure" function and thus may be run multiple times. * @return The value in the Atom after the function is applied. */ public T swap(Function<? super T, ? extends T> f) { while (true) { final var start = ref.get(); final var res = f.apply(start); if (this.ref.compareAndSet(start, res)) { return res; } } } /** * Pair of the new value swapped into an atom and some value that was * derived in the course of calculating that new value. * @param <T> Type of the new value. * @param <R> Type of the derived value. */ public record SwapResult<T, R>(T newValue, R derivedValue) {} /** * Performs a swap that carries over some context from the computation to the caller. 
* * For example, a basic usage would be to return whether a value was inserted into a map. * * <pre>{@code * sealed interface PlayerJoinResult permits AlreadyInGame, Success {} * record AlreadyInGame() implements PlayerJoinResult{} * record Success(String playerId) implements PlayerJoinResult {} * // ... * final var playerId = UUID.randomUUID().toString(); * final var gameAtom = Atom.of(Map.empty()); * final var swapResult = gameAtom.complexSwap(game -> { * if (game.contains(playerId)) { * return new SwapResult<>(game, new AlreadyInGame()); * } * else { * return new SwapResult<>(game.put(playerId, new Object()), new Success(playerId)); * } * }); * * if (swapResult.derivedValue() instanceof AlreadyInGame) { * return "Oh no!"; * } * else { * return "hooray"; * } * }</pre> * * @param f The function to apply to the current value. It is expected that this * will be a "pure" function and thus may be run multiple times. * @param <R> The type of the context attached to the final result. * @return A pair of the new value put into the atom and the derived value from the * computation of that new value. */ public <R> SwapResult<T, R> complexSwap( Function<? super T, SwapResult<? extends T, ? extends R>> f ) { while (true) { final var start = ref.get(); final var res = f.apply(start); if (this.ref.compareAndSet(start, res.newValue())) { return new SwapResult<>(res.newValue(), res.derivedValue()); } } } /** * Resets the value in the atom to the given value. * @param data The new value to be stored in the atom. * @return The new value stored in the atom. */ public T reset(T data) { this.ref.set(data); return data; } /** * @return The atom's current value. 
*/ public T get() { return this.ref.get(); } @Override public String toString() { return "Atom[value=" + this.get() + "]"; } } ``` Sat, 19 Jun 0021 05:00:00 +0000How is abstraction achieved through abstract classes and interfaceshttps://mccue.dev/pages/6-8-21-how-is-abstraction-achieved ## Question from Data20#5839 > How is abstraction achieved through abstract classes and interfaces? > > I may have a misunderstanding somewhere, so I'll elaborate on where I'm at with the concept. > > From my understanding: Abstraction is a concept of OOP that focuses on hiding implementation detail and showing only what is necessary. The functional detail > > This is the example I see when I go with this definition: > > ```java > public class Base { > > public int area(int length, int width){ > return length * width; > } > } > > public class Main { > public static void main(String[] args) { > Base r1 = new Base(); > > System.out.println("The area of this rectangle is " + r1.area(5,6) + "."); > } > } > ``` > > When I called the area method in Base, I just wanted the area of a rectangle. I didn't need to know the actual formula for it, just the result of it. > > Having said that, I feel that I'm off somewhere and am confused with how abstraction is achieved with Abstract classes and Interfaces. Okay so first - abstraction is not a concept of OOP. Your area method does abstract the actual mechanism of computation. i.e. you could rewrite it like this ```java public int area(int length, int width){ if (width < 0) { return area(length, -1 * width) * -1; } else if (width == 0) { return 0; } else { return length + area(length, width - 1); } } ``` and its contract to the rest of the code would remain the same. What interfaces give you is a mechanism for *polymorphism*. 
For instance, if I wrote code like this ```java public enum IntOrder { LessThan, EqualTo, GreaterThan; } public final class IntComparator { public IntOrder compare(int a, int b) { if (a < b) { return IntOrder.LessThan; } else if (a == b) { return IntOrder.EqualTo; } else { return IntOrder.GreaterThan; } } } ``` I've successfully abstracted the process of comparing two integers and if I wanted to write a sort function ```java public static int[] sort(int[] xs) { // somewhere i can call new IntComparator().compare(...); and use that for ordering } ``` I could use that abstracted comparison behavior. But if I wanted the sort function to "be abstracted" over how it compares these integers I could use an interface ```java public interface IntComparator { IntOrder compare(int a, int b); } ... public final class NormalIntComparator implements IntComparator { public IntOrder compare(int a, int b) { if (a < b) { return IntOrder.LessThan; } else if (a == b) { return IntOrder.EqualTo; } else { return IntOrder.GreaterThan; } } } ... public static int[] sort(int[] xs, IntComparator comparator) { // somewhere i can call comparator.compare(...); and use that for ordering } ``` And this would let me write many different implementations of comparing ints. For instance, maybe comparing them in reverse order, and treating them "the same" in some other abstraction. So polymorphism has utility in building abstractions and interfaces give you polymorphism. Abstract classes give you polymorphism and code sharing, but their primary utility is code sharing. You can still make abstractions and get polymorphism in non-OOP languages; your mechanisms for doing so will just be different. > I'm taking in what all you said. From the readings I've done, I've always thought that Abstraction was apart of OOP. In your defense, a large part of why OOP and specifically Java is so omnipresent is heavy degrees of marketing and hype. A lot of what you read probably never stops to point out nuance since it's all somewhat derived from marketing. 
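If it helps to see the interface version of the comparator actually run, here is a small self-contained sketch. The `ReverseIntComparator` and `IntSorter` names and the choice of insertion sort are mine, picked only to keep the example short. The point is that `sort` never knows which comparison it is using.

```java
import java.util.Arrays;

enum IntOrder { LessThan, EqualTo, GreaterThan }

interface IntComparator {
    IntOrder compare(int a, int b);
}

final class NormalIntComparator implements IntComparator {
    @Override
    public IntOrder compare(int a, int b) {
        if (a < b) {
            return IntOrder.LessThan;
        } else if (a == b) {
            return IntOrder.EqualTo;
        } else {
            return IntOrder.GreaterThan;
        }
    }
}

// Hypothetical second implementation: flips the arguments to reverse the ordering.
final class ReverseIntComparator implements IntComparator {
    private final IntComparator normal = new NormalIntComparator();

    @Override
    public IntOrder compare(int a, int b) {
        return normal.compare(b, a);
    }
}

final class IntSorter {
    private IntSorter() {}

    // Insertion sort on a copy of the input; the ordering comes entirely from the comparator.
    static int[] sort(int[] xs, IntComparator comparator) {
        int[] sorted = Arrays.copyOf(xs, xs.length);
        for (int i = 1; i < sorted.length; i++) {
            int current = sorted[i];
            int j = i - 1;
            while (j >= 0 && comparator.compare(sorted[j], current) == IntOrder.GreaterThan) {
                sorted[j + 1] = sorted[j];
                j--;
            }
            sorted[j + 1] = current;
        }
        return sorted;
    }
}
```

Calling `IntSorter.sort(xs, new NormalIntComparator())` sorts ascending and `IntSorter.sort(xs, new ReverseIntComparator())` sorts descending, with no change to `sort` itself. That swap-ability is the polymorphism the interface is buying you.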
Tue, 08 Jun 0021 05:00:00 +0000How to connect opencv in Python to Java Swinghttps://mccue.dev/pages/6-6-21-connect-python-to-swing ## Question from ☮𝙬𝙖𝙟𝙚𝙚𝙝𝙖☮#5024 > hey people. so, i did my backend of the project in opencv-python and developed the front end using swing java. now i want to connect them. can somebody guide me how to do this? This is, conceptually, what an API is for. Not necessarily an "api served over http" but just the general thrust of "these are two independent processes that will communicate via some mechanism". More details about how you want these things to communicate would be helpful - are they going to be run on the same machine, etc. > Yes I want them to run on the same machine > > I searched a little bit and found some method > > Using execute shell command > > But I'm not receiving the output I want > > Maybe I'm doing something wrong idk From the Java side you can launch the Python app using `ProcessBuilder` and then communicate with it that way. https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/lang/ProcessBuilder.html > Ok I will try this thanks for the link. That's just step 1, but yeah start there. Sun, 06 Jun 0021 05:00:00 +0000How to make infinite character combinationshttps://mccue.dev/pages/6-1-21-infinite-char-combinations ## Question from Bulver#9256 > Hey, please forgive my bad english. I am trying to get an infinite amount of char combinations which are stored in an array. I cant figure out how i can implement this in one method. Does anyone have ideas? > >```java >public static void one(char[] arr){ > for(int i = 0; i<95; i++){ > String a = ""+arr[i]; > if(hash2(a)%m==0){ > System.out.println(a); > } > } > } > > public static void two(char[] arr){ > for(int i = 0; i<95; i++){ > for(int j = 0;j<95;j++){ > String a = ""+arr[i]+arr[j]; > if(hash2(a)%m==0){ > System.out.println(a); > } > } > } > } > ``` > > any ideas on how to infinitely stack for loops? 
>
> I don't know how many digits i need
>
> I'd like to have one method which runs till i stop it and checks every combination
>
> Do you understand my problem?
>
> ```java
> public static void three(char[] arr){
>     for(int i = 0; i<95; i++){
>         for(int j = 0;j<95;j++){
>             for(int k = 0; k<95;k++){
>                 String a = ""+arr[i]+arr[j]+arr[k];
>                 if(hash2(a)%m==0){
>                     System.out.println(a);
>                 }
>             }
>         }
>     }
> }
> ```
>
> There must be a better way to check for more digits

One way - instead of having all the digits be "on the stack"

```java
String a = ""+arr[i]+arr[j]+arr[k];
```

store each loop's state in an object

```java
import java.util.Iterator;

public final class CharIterator implements Iterator<String> {
    private int i;

    @Override
    public boolean hasNext() {
        return i < 95;
    }

    @Override
    public String next() {
        char c = (char) i;
        i++;
        return Character.toString(c);
    }
}
```

and then all you need is a way to

1. make an Iterable from that (should be fairly simple)
2. chain two iterables into one larger iterable

basically make this

```java
import java.util.function.BiFunction;

public final class IterableChain<T> implements Iterable<T> {
    public <A, B> IterableChain(Iterable<A> i1, Iterable<B> i2, BiFunction<A, B, T> combine) {

    }

    // ...
}
```

and then all you'll need to do is

```java
// CharIterable is the Iterable from step 1
Iterable<String> iterable = new CharIterable();
for (int i = 0; i < 10; i++) {
    iterable = new IterableChain<>(new CharIterable(), iterable, (first, rest) -> first + rest);
}

for (String s : iterable) {
    ...
}
```

does that make sense?

> Yes, thanks alot!!

Tue, 01 Jun 0021 05:00:00 +0000

What are Servlets
https://mccue.dev/pages/5-31-21-what-are-servlets-and-jetty

## Question from Gergő#5263

> What does Servlet mean?
>
> And DispatchServlet

It's kinda historical. It's a term for a small, pluggable bit of what would be a larger server.

The only aspect you should need to care about is that it is the interface by which you can attach to Jetty and handle http requests.

> thanks!, and what is jetty? 🧐

It's a Java web server.
You would have Jetty handle listening on whatever ports and parsing http requests, and your code in the "servlet" would decide how to respond to those requests.

Mon, 31 May 0021 05:00:00 +0000

Basic Cleaner Example
https://mccue.dev/pages/5-7-21-basic-cleaner-example

```java
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.lang.ref.Cleaner;

public final class HasResource implements AutoCloseable {
    // This has a Thread backing it so you want to share your cleaner across the whole lib
    private static final Cleaner CLEANER = Cleaner.create();

    private final FileOutputStream outputStream;
    private final Cleaner.Cleanable cleanable;

    // You don't want your cleaner task to be a lambda so you don't accidentally capture a
    // ref. to the wrapping object.
    private static final class CleanerTask implements Runnable {
        private final FileOutputStream outputStream;

        private CleanerTask(FileOutputStream outputStream) {
            this.outputStream = outputStream;
        }

        @Override
        public void run() {
            try {
                this.outputStream.close();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public HasResource() {
        try {
            this.outputStream = new FileOutputStream("ex.csv");
        } catch (FileNotFoundException e) {
            throw new RuntimeException(e);
        }
        this.cleanable = CLEANER.register(this, new CleanerTask(this.outputStream));
    }

    @Override
    public void close() {
        // This has "at most once" semantics so if they close it
        // the cleaner won't run your cleanup logic a second time
        this.cleanable.clean();
    }
}
```

Fri, 07 May 0021 05:00:00 +0000

The Problem with Annotation Processors
https://mccue.dev/pages/3-18-21-the-problem-with-annotation-processors

For reasons unknown, broaching the subject of annotation processors seems to elicit some primordial fear in developers. People tend to associate annotation processing with borderline witchcraft and sorcery performable only by the most adept of basement wizards. It doesn’t have to be that way.
Annotation processing doesn’t have to be the big scary monster hiding under the bed.

<figure> <img src="/pages/3-18-21-bed.png" alt="Image taken from https://sourcesofinsight.com/monsters-under-the-bed/"/> <figcaption>Image taken from <a href="https://sourcesofinsight.com/monsters-under-the-bed/">https://sourcesofinsight.com/monsters-under-the-bed/</a></figcaption> </figure>

No doubt, problems with annotation processing _do_ exist, but so do solutions to those problems. One problem that stands out in particular is the difficulty of unit testing annotation processors, a problem that [elementary](https://github.com/Pante/elementary), a suite of JUnit 5 extensions, solves.

## What’s this Annotation Processing Thingamajig?

For the uninitiated, an annotation processor is similar to a compiler plug-in. Like its namesake, it can be called by the compiler to _process_ annotations, e.g. `@Nullable`, during compilation. Said process covers an extremely broad and vague expanse: everything from simple value validation to a full-blown pluggable type system like the [checker-framework](https://github.com/typetools/checker-framework), and from a simple `@Builder` annotation to full-blown dependency injection via code generation like [Dagger](https://github.com/google/dagger). Post Java 9, it resides inside the [`java.compiler`](https://docs.oracle.com/en/java/javase/11/docs/api/java.compiler/module-summary.html) module.

Inside an annotation processor lies the fabled domain of [`Element`](https://docs.oracle.com/en/java/javase/11/docs/api/java.compiler/javax/lang/model/element/package-summary.html)s and [`TypeMirror`](https://docs.oracle.com/en/java/javase/11/docs/api/java.compiler/javax/lang/model/type/package-summary.html)s, Abstract Syntax Tree (AST) representations of the Java language and counterparts to the reflection framework found in Javaland. `Element`s represent syntactical constructs such as methods, fields etc.
while `TypeMirror`s represent, well, types such as reference types (classes) and primitives, but we digress.

## Why So Difficult?

So what makes testing annotation processing so difficult? In our opinion, everything about the annotation processing environment. We’re not claiming that the environment is some evil grotesque being; it’s actually surprisingly well-designed. The problem lies squarely with the unavailability of the environment outside the compiler. Without its environment, testing an annotation processor is a lost cause. A good drinking game is taking a shot for each method call in an annotation processor that requires an annotation processing environment.

```java
import com.karuslabs.elementary.junit.annotations.Case;
import com.karuslabs.utilitary.Logger;
import com.karuslabs.utilitary.type.TypeMirrors;

import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.element.TypeElement;
import javax.lang.model.element.VariableElement;
import javax.lang.model.util.Elements;

class StringFieldLint extends AbstractProcessor {

    Elements elements;
    TypeMirrors types;
    Logger logger;

    @Override
    public void init(ProcessingEnvironment environment) {
        super.init(environment);
        elements = environment.getElementUtils(); // (1)
        types = new TypeMirrors(elements, environment.getTypeUtils()); // (2)
        logger = new Logger(environment.getMessager()); // (3)
    }

    @Override
    public boolean process(Set<?
            extends TypeElement> set, RoundEnvironment round) {
        var elements = round.getElementsAnnotatedWith(Case.class); // (4)
        for (var element : elements) {
            if (!(element instanceof VariableElement)) {
                logger.error(element, "Element is not a variable"); // (5)
                continue;
            }

            var variable = (VariableElement) element;
            if (!types.isSameType(variable.asType(), types.type(String.class))) { // (6) (7) (8)
                logger.error(element, "Element is not a string"); // (9)
                continue;
            }
        }

        return false;
    }
}
```

Pretty much everything requires an annotation processing environment, as illustrated above. At this juncture, we have four ways out of this pickle:

* Don’t bother with unit testing
* Wait for something, anything to happen
* Mock/re-implement the annotation processing environment
* Smuggle the annotation processing environment out of the compiler

To keep a long story short, we ended up becoming smugglers.

## Smuggler’s Discovery

While trawling the web, we discovered Google’s [compile-testing](https://github.com/google/compile-testing) project, a hidden gem buried beneath the swathes of GitHub projects. Through some clever hacks, the project managed to provide an annotation processing environment for unit tests, albeit a little lackluster and limited. Exploring the project, it became obvious that it wasn’t the panacea that we had hoped for. The project suffered from a few limitations that we weren’t able to stomach:

* Supports only JUnit 4. The annotation processing environment is only available through a JUnit rule, something that is no longer supported in JUnit 5. We have been using JUnit 5 for the longest time and don’t intend to downgrade anytime soon.
* The utilities for working with the annotation processing environment are limited. They _work_, but they could be significantly more ergonomic.
* Inability to traverse the `Element`s and `TypeMirror`s of compiled files in a test. This is essential to allow compiled files to be used as test cases.
* Scope limitation of the annotation processing environment. The annotation processing environment is limited to the scope of a test method. This is inconvenient as initialization of test state cannot be shared between multiple tests. Furthermore, the design lends itself to unexpected behaviour.

```java
class SomeTest {
    @Rule
    CompilationRule rule = new CompilationRule();

    // This call compiles, but throws an exception because it runs outside a test method
    Types types = rule.getTypes();

    @Test
    void test() {
        ...
    }
}
```

This isn’t to say that the project is _bad_, just that our objectives are different. In fact, some parts of elementary are based on compile-testing. As its name implies, compile-testing focuses on testing the compilation of code, not annotation processing. That’s not our objective. Our objective is to simplify unit testing annotation processors. Thus, after a healthy dose of _“Hold my beer”_ and [_Not Invented Here Syndrome_](https://docs.oracle.com/javase/8/docs/api/java/util/logging/package-summary.html), the elementary project was conceived.

## Elementary, My Dear Watson

With compile-testing as a foundation, we embarked on a quest to bring Elementary to life. Starting with a clean slate blessed us with the freedom to make decisions that would otherwise incite an angry mob with pitchforks and torches:

* Support only Java 11 & above. The module system in Java 9 introduced some breaking changes to the `jdk.compiler` module and `ClassLoader`s. We don't want to deal with that.
* Support only JUnit 5. We do not want to support a JUnit 4 equivalent that we do not use.

Our experience working on the [Chimera code generation tool](https://github.com/Pante/chimera) told us that tests for annotation processors fell into the classic black-box and white-box testing categories. For small and/or simple annotation processors, it was more efficient to invoke the annotation processor inside a compiler against sample Java source files.
As the complexity and size of an annotation processor increase, running the annotation processor against sample files yields diminishing returns. It becomes far less tedious to isolate and test the individual logical components. Two different categories with two completely different sets of requirements.

<h2 id="box-of-fun"> Box of Fun Things </h2>

Black-box testing annotation processors _can_ be fun. It _doesn’t_ have to be a myriad of set-up, tear-down and configuration. Not according to `JavacExtension` at least. For each test, `JavacExtension` compiles a suite of test cases with the given annotation processor(s). The results of the compilation are then funneled to the test method for subsequent assertions. All configuration is handled via annotations with no additional set-up or tear-down required.

> They say seeing is believing so let’s get on with the seeing.

Our imaginary annotation processor is fairly straightforward. All it does is check whether an element that is annotated with `@Case` is also a string field. If an element isn't a string or variable, an error message is printed. Since it's _that_ straightforward, just black-box testing our annotation processor is enough.

```java
@SupportedAnnotationTypes({"*"})
class ImaginaryProcessor extends AnnotationProcessor {

    @Override
    public boolean process(Set<? extends TypeElement> set, RoundEnvironment round) {
        var elements = round.getElementsAnnotatedWith(Case.class);
        for (var element : elements) {
            if (element instanceof VariableElement) {
                var variable = (VariableElement) element;
                if (!types.isSameType(variable.asType(), types.type(String.class))) {
                    logger.error(element, "Element is not a string");
                }
            } else {
                logger.error(element, "Element is not a variable");
            }
        }

        return false;
    }
}
```

Testing our imaginary annotation processor isn’t too difficult either. All we need to do is to sprinkle a few annotations on the test class, create some test cases, check the compilation results, and voilà! We’re done.
```java
import com.karuslabs.elementary.Results;
import com.karuslabs.elementary.junit.JavacExtension;
import com.karuslabs.elementary.junit.annotations.Case;
import com.karuslabs.elementary.junit.annotations.Classpath;
import com.karuslabs.elementary.junit.annotations.Options;
import com.karuslabs.elementary.junit.annotations.Processors;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

import static org.junit.jupiter.api.Assertions.assertEquals;

@ExtendWith(JavacExtension.class)
@Options("-Werror")
@Processors({ImaginaryProcessor.class})
@Classpath("my.package.ValidCase")
class ImaginaryTest {

    @Test
    void process_string_field(Results results) {
        assertEquals(0, results.find().errors().count());
    }

    @Test
    @Classpath("my.package.InvalidCase")
    void process_int_field(Results results) {
        assertEquals(1, results.find().errors().contains("Element is not a string").count());
    }
}
```

Let’s break down the code snippet.

* By annotating the test class with `@Options`, we can specify the [compiler flags](https://docs.oracle.com/en/java/javase/11/tools/javac.html) used when compiling the test cases. In this snippet, `-Werror` indicates that all warnings will be treated as errors.
* To specify which annotation processor(s) are to be invoked with the compiler, we can annotate the test class with `@Processors`. No prizes for correctly guessing which annotation processor in this snippet.
* Test cases can be included for compilation by annotating the test class with either `@Classpath` or `@Inline`. Java source files on the classpath can be included using `@Classpath` while strings inside `@Inline` can be transformed into an inline source file for compilation. In this snippet, both [`ValidCase`](https://github.com/Pante/elementary/blob/master/elementary/src/test/resources/ValidCase.java) and [`InvalidCase`](https://github.com/Pante/elementary/blob/master/elementary/src/test/resources/InvalidCase.java) are included for compilation.
* An annotation’s scope is tied to its target’s scope. If a test class is annotated, the annotation will be applied for all test methods in that class.
On the same note, an annotation on a test method will only be applied on said method.
* `Results` represents the results of a compilation. We can specify `Results` as a parameter of test methods to obtain the compilation results. In this snippet, `process_string_field(...)` will receive the results for `ValidCase` while `process_int_field(...)` will receive the results for both `ValidCase` and `InvalidCase`.

<h2 id="pandoras-box"> Pandora’s Box </h2>

This is where things become really interesting. White-box testing isn’t as simple as invoking an annotation processor since the possibilities of what a test is trying to prove are unlimited. In a black-box test, we need only prove that the compilation results of a known annotation processor against a fixed number of files match certain criteria. On the contrary, in a white-box test, we do not know why, what and how a component is being tested. The best we can do is make the annotation processing environment accessible inside the test class.

> “It can’t be that difficult to allow class scoped annotation processing environments, compile-testing already does that.”

We, too, initially felt the same way and boy, were we wrong. While compile-testing does provide an annotation processing environment, it is limited to the scope of a test method. Not being able to access said environment outside of methods means repetitive and verbose initialization code, which blows. Sadly, we couldn’t just tweak compile-testing’s trick either as it was found to be incompatible with our objective.

The secret sauce behind compile-testing is actually pretty straightforward. Each test method is intercepted by a [JUnit rule](https://github.com/google/compile-testing/blob/master/src/main/java/com/google/testing/compile/CompilationRule.java) and wrapped in an annotation processor that invokes the method during processing. The test is subsequently executed inside a compiler that the JUnit rule invokes.
Unfortunately, in this technique, an annotation processing environment is available only while a test method is executing. It isn’t possible to tweak the technique to intercept the creation of a test instance and inject the test instance inside an annotation processor either, due to the constraints of the JUnit lifecycle.

A great deal of time spent at the drawing board later, we succeeded in creating the `ToolsExtension`. This extension exploits the fact that a test instance only needs access to an annotation processing environment. Tests don't need to be executed inside an annotation processor. Once we established that, our trick was to run a compiler with a blocking annotation processor on a daemon thread before each test instance is created. With compilation suspended inside the processor, the environment is made accessible to the test instance on the main thread. Only after all tests have been executed does compilation resume.

<figure> <img width="100%" src="/pages/3-18-21-diagram.png" alt="Here’s a poorly drawn MS Paint diagram illustrating the entire process"/> <figcaption>Here’s a poorly drawn MS Paint diagram illustrating the entire process</figcaption> </figure>

Let’s pretend that the imaginary processor we described in [Box of Fun Things](#box-of-fun) has grown in scope and size and was refactored into multiple components, one of which checks if an element is a string variable like the original annotation processor.
```java class Lint { final TypeMirrors types; final TypeMirror expectedType; Lint(TypeMirrors types) { this.types = types; this.expectedType = types.type(String.class); } public boolean lint(Element element) { if (!(element instanceof VariableElement)) { return false; } var variable = (VariableElement) element; return types.isSameType(expectedType, variable.asType()); } } ``` Using the `ToolsExtension` to test the annotation processor yields the following code snippet: ```java import com.karuslabs.elementary.junit.Cases; import com.karuslabs.elementary.junit.Tools; import com.karuslabs.elementary.junit.ToolsExtension; import com.karuslabs.elementary.junit.annotations.Inline; import com.karuslabs.utilitary.type.TypeMirrors; @ExtendWith(ToolsExtension.class) @Inline(name = "Samples", source = { "import com.karuslabs.elementary.junit.annotations.Case;", "", "class Samples {", " @Case(\"first\") String first;", " @Case String second() { return \"\";}", "}"}) class ToolsExtensionExampleTest { Lint lint = new Lint(Tools.typeMirrors()); @Test void lint_string_variable(Cases cases) { var first = cases.one("first"); assertTrue(lint.lint(first)); } @Test void lint_method_that_returns_string(Cases cases) { var second = cases.get(1); assertFalse(lint.lint(second)); } } ``` Let’s break down the code snippet: * By annotating the class with `@Inline` we can specify an inline Java source file which `ToolsExtension` includes for compilation. * The annotation processing environment can be accessed via either the `Tools` class or dependency injection into the test class's constructor or test methods. In this case, we access the current `TypeMirrors` using the static method on `Tools`. * An in-depth explanation for both `@Case` and `Cases` will be provided in the following section. For now, it's just the mechanism used to find elements in compiled files. 
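
The blocking-processor trick described earlier in this section can be sketched with nothing but the JDK's `javax.tools` and `javax.annotation.processing` APIs. This is a simplified illustration of the idea, not elementary's actual implementation: the names `SmuggleSketch`, `BlockingProcessor`, `StringSource` and `Placeholder` are made up for the example, and it assumes the code runs on a JDK (so that a system compiler is available).

```java
import java.net.URI;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public final class SmuggleSketch {

    // Hypothetical processor: publishes its environment, then parks the compiler thread.
    @SupportedAnnotationTypes("*")
    static final class BlockingProcessor extends AbstractProcessor {
        final CompletableFuture<ProcessingEnvironment> environment = new CompletableFuture<>();
        final CountDownLatch release = new CountDownLatch(1);

        @Override
        public SourceVersion getSupportedSourceVersion() {
            return SourceVersion.latestSupported();
        }

        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment round) {
            environment.complete(processingEnv); // hand the environment to the waiting thread
            try {
                release.await(); // compilation stays suspended until the caller is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return false;
        }
    }

    // A throwaway in-memory source file so the compiler has something to process.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Looks up a type through an environment smuggled out of a suspended compilation.
    static String qualifiedName(String typeName) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a runtime without javac
        BlockingProcessor processor = new BlockingProcessor();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null,
                List.of("-proc:only"), // run annotation processing only, write no class files
                null,
                List.of(new StringSource("Placeholder", "final class Placeholder {}")));
        task.setProcessors(List.of(processor));

        Thread compilerThread = new Thread(task::call, "blocked-compiler");
        compilerThread.setDaemon(true);
        compilerThread.start();
        try {
            ProcessingEnvironment env = processor.environment.orTimeout(30, TimeUnit.SECONDS).join();
            return env.getElementUtils().getTypeElement(typeName).getQualifiedName().toString();
        } finally {
            processor.release.countDown(); // let the compilation resume and finish
        }
    }

    public static void main(String[] args) {
        System.out.println(qualifiedName("java.lang.String"));
    }
}
```

The sketch blocks inside `process` and releases the compiler as soon as one lookup is done; a real extension would presumably hold the compilation open for the lifetime of the test class instead.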
## The Case for Cases

With the completion of `ToolsExtension`, we succeeded in our quest to smuggle an annotation processing environment out of the compiler. Yet one final piece of the puzzle still remains. How do we create those elements to test our code against?

The `jdk.compiler` module doesn't provide a way to create elements. While mocking an `Element` is possible, it is far from developer-friendly. Not only is the initialization verbose, unwieldy and convoluted, it is also difficult to guarantee that the mocked element's behaviour matches its actual counterpart. We can't look to compile-testing for guidance either since it doesn't provide anything like that.

After much headache, we managed to find the missing piece. Let’s have the compiler transform our test cases written in idiomatic Java into elements for us. That way, we avoid the mess surrounding the initialization of elements and the resultant code is far easier to understand. To achieve that, we required some way to fetch elements from the compiler. After further refinement of the concept, we eventually developed the `Cases` class and corresponding `@Case` annotation.

Returning to our code snippet from [Pandora’s Box](#pandoras-box), let’s analyze it in greater detail.
```java import com.karuslabs.elementary.junit.Cases; import com.karuslabs.elementary.junit.Tools; import com.karuslabs.elementary.junit.ToolsExtension; import com.karuslabs.elementary.junit.annotations.Inline; import com.karuslabs.utilitary.type.TypeMirrors; @ExtendWith(ToolsExtension.class) @Inline(name = "Samples", source = { "import com.karuslabs.elementary.junit.annotations.Case;", "", "class Samples {", " @Case(\"first\") String first;", " @Case String second() { return \"\";}", "}"}) class ToolsExtensionExampleTest { Lint lint = new Lint(Tools.typeMirrors()); @Test void lint_string_variable(Cases cases) { var first = cases.one("first"); assertTrue(lint.lint(first)); } @Test void lint_method_that_returns_string(Cases cases) { var second = cases.get(1); assertFalse(lint.lint(second)); } } ``` * By annotating a test case with `@Case` inside a Java source file, we can fetch its corresponding element from `Cases`. A `@Case` may also contain a label to simplify retrieval. * Through `Cases`, we can fetch elements by either the label or index of the case. We can obtain an instance of `Cases` via `Tools.cases` or like in this code snippet, through dependency injection. ## Idea Graveyard As mentioned at the beginning of this article, we explored a few other avenues which eventually led to dead-ends. We thought them to be interesting enough to discuss in the following sections. Most of them ended up getting shelved due to the impracticality and unacceptable trade-offs for the solution. Not testing annotation processors goes without saying to be a terrible choice. Just because testing them is difficult doesn’t give us the liberty of skipping that. The problems will only worsen over time if we choose to take the easy route out. Furthermore, most annotation processors usually do code generation and static type analysis. Both of which are extremely difficult to troubleshoot. > “Good things come to those wait. 
But better things come to those who work for it.”

Had [JEP 119: javax.lang.model Implementation Backed by Core Reflection](https://openjdk.java.net/jeps/119) been shipped with JDK 8, I highly doubt elementary would have even been conceived. It solved the issue with accessing an annotation processing environment outside of a compiler by providing a standard implementation. Sadly, it was shelved and future efforts seem to have stalled. A wait-and-see approach to unit testing annotation processors would thus be unfeasible as there isn’t anything to wait on.

A problem more difficult than testing annotation processing is trying to mock/re-implement the annotation processing environment. Since elements represent an AST for the Java language, we need to be intimate with the language specification to guarantee that the behaviour of mocked/re-implemented elements does not deviate from the original. This honestly makes testing annotation processors seem like a Disney fairy-tale; we don’t want to touch that even with a ten-foot pole. A few re-implementations do exist but seem to have been long abandoned. In the end, the troubles outweighing the benefits is what led us to abandon this avenue.

## Final Thoughts

We’ve reached the end of our journey to simplify the testing of annotation processors. Looking back, it has been an absolute blast working on the project. How widely adopted this project will be remains to be seen. But if anything, I hope that this article encouraged you to start playing around with annotation processors.

In summary, Elementary introduces:

* The `JavacExtension` for black-box testing and testing of simple annotation processors.
* A class-scoped annotation processing environment for test classes extended with `ToolsExtension`.
* Utilities for fetching elements from the compiler to the test class.

That said, this is only the beginning of yet another journey.
A journey that I am hopeful will bring many new features and improvements to elementary in the time to come. Until the next time, happy coding!

---

Article was originally published on [Medium](https://matthiasngeo.medium.com/the-problem-with-annotation-processors-802548a3bfdb).

*Shameless advertising*

This article is based on Elementary, https://github.com/Pante/elementary.

Thu, 18 Mar 0021 05:00:00 +0000

Should array variable names be plural
https://mccue.dev/pages/3-23-19-should-array-names-be-plural

## Question from Abdul#4709

> Just wondering for arrays is it considered a better practice to keep the var names plural?
>
> Or I guess it depends on language?

For me, when I name a collection of things it usually ends up that using plural describes it better, but it all depends on what you are naming.

I wouldn't make a collection of cats and call it `cat`. That's just confusing.

But if you are keeping some messages in a queue, `messageQueue` works as a name since having the word queue implies both the direction data flows and the idea of possible plurality.

Sat, 23 Mar 0019 05:00:00 +0000

Should a game use mutable state
https://mccue.dev/pages/3-22-19-should-a-game-use-mutable-state

## Question from decentDrei#7560

> Would you say that a basic game, like a canvas game should use mutable state?
>
> 2d physics. Standard side-scroll

Pretty solidly going to say not to worry about it. Games were made forever with a global mutable area for gamestate.

https://youtu.be/aKLntZcp27M

This talk has some good pointers on game dev. It's part of a much larger topic but the speaker is good.

And there are gradients to avoiding mutation. Don't avoid it like the plague, just keep it in mind.
Fri, 22 Mar 0019 05:00:00 +0000

JS vs Java - dynamic typing
https://mccue.dev/pages/3-22-19-js-vs-java

## Conversation with somebody#0002

So as far as JS and Java are concerned, the syntaxes of the two languages are very similar but the semantics of the two vary wildly, even discounting browser weirdness.

> but for simple things it's very similar. small algorithms and things like that

For simple things all languages are similar.

Okay, so here's a bet. I will write a bit of javascript code. Very small, very simple. You will try to write some mock Java that does the same thing. I bet that you will find it harder to do.

```javascript
function upperCaseName(entity) {
    return {...entity, name: entity.name.toUpperCase() }
}

const dog = { name: "Fido", favorite_toy: "Squeeks" }
const person = { name: "Bob", majored_in: "Physics", age: 30 }

upperCaseName(dog)
// { name: "FIDO", favorite_toy: "Squeeks" }
upperCaseName(person)
// { name: "BOB", majored_in: "Physics", age: 30 }
```

> Objects are a JS-specific thing, but... that should be doable in Java. won't look nice, won't behave well, but it'll be doable

Oh, so the thing that most libraries in JS use as their primary data representation is JS specific?

> yes
>
> in Java you'd use a POD for them
>
> in JS you don't want that overhead, in Java it's kinda unavoidable

POD?

> plain old data

You mean a class with getters and setters, right? That won't work here.

Dynamic languages have a whole set of design patterns unique to their inherent flexibility in the same way functional languages like Haskell have a whole set of design patterns unique to their inherent restrictions.

Javascript is dynamic and weakly typed and Java is static and strongly typed. The second bit there is very important because it is a clue to the underlying semantics of the language.

> yeah, of course.
>
> also kinda working in Java:
> ```java
> import java.util.Map;
> import java.util.HashMap;
>
> public class Main {
>     public static void main(String[] args) {
>         System.out.println(upperCaseName(new HashMap<String, Object>() {{ put("name", "Fido"); put("favorite_toy", "Squeeks"); }}));
>     }
>
>     static Map<String, Object> upperCaseName(Map<String, Object> map) {
>         if (map.containsKey("name") && map.get("name") instanceof String) {
>             map.put("name", ((String) map.get("name")).toUpperCase());
>         }
>         return map;
>     }
> }
> ```

In Javascript, saying `class` has an insanely different meaning than in Java.

In Java, when you say class you are declaring a template for a concrete object to be created later. You are saying that your object will have these slots, that your object will have these methods that will work on said slots of data, and you define how the object will be constructed.

But in Javascript you aren't doing that. In Javascript you are actually creating an object. That object is the "prototype" for new objects to be created from and you are saying "hey just copy this object".

> that's the prototype, not the class

There is no such thing as the class. Not in any form that isn't syntax sugar.

The main similarity between JS and Java is the syntax, which was done on purpose. Hence the misleading name of JavaScript. The whole point was to _look_ like Java at first glance, but at its core JS is more a badly implemented lisp than anything else.
Now I am going to do the dog example in idiomatic java give me a few minutes ```java import java.util.Objects; interface HasName<T extends HasName<T>> { String getName(); T withName(String name); } class Dog implements HasName<Dog> { private final String name; private final String favoriteToy; Dog(String name, String favoriteToy) { Objects.requireNonNull(name, "Name should not be null"); Objects.requireNonNull(favoriteToy, "Favorite Toy should not be null"); this.name = name; this.favoriteToy = favoriteToy; } @Override public String getName() { return this.name; } @Override public Dog withName(String name) { return new Dog(name, this.favoriteToy); } public String getFavoriteToy() { return this.favoriteToy; } public Dog withFavoriteToy(String favoriteToy) { return new Dog(this.name, favoriteToy); } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; Dog dog = (Dog) o; return Objects.equals(name, dog.name) && Objects.equals(favoriteToy, dog.favoriteToy); } @Override public int hashCode() { return Objects.hash(name, favoriteToy); } @Override public String toString() { return "Dog{" + "name='" + name + '\'' + ", favoriteToy='" + favoriteToy + '\'' + '}'; } } class Person implements HasName<Person> { private String name; private String majoredIn; private int age; Person(String name, String majoredIn, int age) { Objects.requireNonNull(name, "Name should not be null"); Objects.requireNonNull(majoredIn, "Majored In should not be null"); this.name = name; this.majoredIn = majoredIn; this.age = age; } @Override public String getName() { return this.name; } @Override public Person withName(String name) { return new Person(name, this.majoredIn, this.age); } public String getMajoredIn() { return this.majoredIn; } public Person withMajoredIn(String majoredIn) { return new Person(this.name, majoredIn, this.age); } public int getAge() { return this.age; } public Person withAge(int age) { return new 
Person(this.name, this.majoredIn, age); } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; Person person = (Person) o; return age == person.age && Objects.equals(name, person.name) && Objects.equals(majoredIn, person.majoredIn); } @Override public int hashCode() { return Objects.hash(name, majoredIn, age); } @Override public String toString() { return "Person{" + "name='" + name + '\'' + ", majoredIn='" + majoredIn + '\'' + ", age=" + age + '}'; } } public class Main { private static <T extends HasName<T>> T upperCaseName(HasName<T> entity) { return entity.withName(entity.getName().toUpperCase()); } public static void main(String[] args) { Dog fido = new Dog("Fido", "Squeeks"); Person bob = new Person("Bob", "Physics", 30); System.out.println("Before:"); System.out.println(fido); System.out.println(bob); Dog fidoUpper = upperCaseName(fido); Person bobUpper = upperCaseName(bob); System.out.println("After:"); System.out.println(fidoUpper); System.out.println(bobUpper); } } ``` Fri, 22 Mar 0019 05:00:00 +0000How to do abstractions in JShttps://mccue.dev/pages/3-22-19-how-to-do-abstractions-in-js ## Question from ForkyFork#7118 > javascript does not have good OOP, no types > > e.g. how are you going to do abstraction in js? interfaces? You really haven't been around the block enough to compare languages to Java. So here is the issue with that as a question. When you say "abstraction", what you are referring to are the static typing constructs in Java that allow you to declare the relationships between objects. Just because JS lacks the ability to declare those constructs in a way that can be checked by a compiler doesn't mean you can't write abstracted code in it. 
Fri, 22 Mar 0019 05:00:00 +0000How to append to an array in Chttps://mccue.dev/pages/3-22-19-how-to-append-to-an-array-in-c

## Question from Deleted User

> is there a way to append to an array in C
>
> not linked lists
>
> i have a callback function which needs to add an item to an array every time it's called
>
> i can't figure out how to do that

Conceptually an array is not a good fit for continual appending, since its size is fixed when it is created. If you want to do an operation like that you would be best off writing or finding your own "ArrayList" kind of wrapper.

> is there a way to find the index of the last item in the list
>
> array*

An array in C is a fixed size block of memory that you have a pointer to the start of. It is up to you to keep track of the size of that array, usually in a separate `int`.

Because you only have the pointer to the start of the block of memory, nothing about that pointer can tell you how much memory ahead of it you are allowed to access.

So if you keep the size of the array and the pointer to the start of the array held somewhere, then the "index of the last item" is something you implicitly know: it is the size minus one.

I might be wrong when I say to store it in an `int`; the more usual type for sizes in C is `size_t`.

> time to check out rust
>
> if anyone can help me out with this i might consider going back to C

The type you are looking for in Rust is probably `Vec`.

Can you share your problem and what you were thinking of for a solution? I can maybe whip up a quick A/B of what it would be in `C` vs `Rust`.

> I have a function which retrieves a few rows from a database and uses a callback function to individually process each row. The callback function is supposed to map the row's data to a structure and add it to an array.
> > this is what i was trying to do ```c void handle_entry(void *entries, int argc, char **argv, char **column_name){ Entry entry; strcpy(entry.title, argv[1]); strcpy(entry.content, argv[2]); } void get_entries(){ sqlite3 *DB; sqlite3_stmt *stmt; char *sql = "SELECT * FROM Entries;"; Entry *entries[100]; sqlite3_open("entries.db", &DB); sqlite3_exec(DB, sql, handle_entry, entries, 0); ``` What is the schema for entries? > ID, Title, Content > ```c > typedef struct Entry { > int ID; > char *title; > char *content; > } Entry; > ``` > ```sql > CREATE TABLE Entries( > ID INTEGER PRIMARY KEY, > Title CHAR(512), > Content TEXT); > ``` Here ya go ```rust use rusqlite::{Connection, NO_PARAMS}; #[derive(Debug)] struct Entry { id: i64, title: String, content: String } fn insert_entry(conn: &Connection, entry: &Entry) -> rusqlite::Result<usize> { conn.execute("\ INSERT INTO Entries (Title, Content) VALUES (?1, ?2) ", &[&entry.title, &entry.content]) } fn all_entries(conn: &Connection) -> rusqlite::Result<Vec<Entry>> { let mut stmt = conn .prepare("SELECT ID, Title, Content FROM Entries")?; let entry_iter = stmt.query_map( NO_PARAMS, |row| Ok(Entry{ id: row.get(0)?, title: row.get(1)?, content: row.get(2)? }) )?; let mut entries = Vec::new(); for entry in entry_iter { entries.push(entry?); } Ok(entries) } fn main() -> rusqlite::Result<()> { let conn = Connection::open("db.sqlite")?; conn.execute("\ CREATE TABLE IF NOT EXISTS Entries( ID INTEGER PRIMARY KEY, Title CHAR(512), Content TEXT); ", NO_PARAMS)?; let entry_1 = Entry { id: 0, title: String::from("Entry Number One"), content: String::from("This is the text for my entry.") }; let entry_2 = Entry { id: 1, title: String::from("Le Second Entreee"), content: String::from("2. What is 2? 
Can you taste it?") }; insert_entry(&conn,&entry_1)?; insert_entry(&conn,&entry_2)?; println!("{:?}", all_entries(&conn)); Ok(()) } ``` ``` /Users/emccue/.cargo/bin/cargo run --color=always --package sequel --bin sequel Compiling sequel v0.1.0 (/Users/emccue/Development/sequel) Finished dev [unoptimized + debuginfo] target(s) in 1.63s Running `target/debug/sequel` Ok([Entry { id: 1, title: "Entry Number One", content: "This is the text for my entry." }, Entry { id: 2, title: "Le Second Entreee", content: "2. What is 2? Can you taste it?" }]) Process finished with exit code 0 ``` > that's gonna take a long time to understand > for me > > im still in ch one in the rust tutorial That's fair. C is definitely a simpler language in terms of number of concepts you need to understand. Fri, 22 Mar 0019 05:00:00 +0000I like it when "if" is an expressionhttps://mccue.dev/pages/3-13-19-if-expression One of the small conveniences ive grown to really like in a programming language is "if" being an expression. I found myself hacking it into some JS I was writing today. ```javascript const thing = (() => { if (condition) { return "hello"; } else { return "world"; } })(); ``` Now, feel free to weigh in on that being bad practice. > as a non JS programmer I have no idea what "thing = (() => {" is supposed to do That creates an anonymous function then calls it immediately. Wed, 13 Mar 0019 05:00:00 +0000The Ferry Problem in Rusthttps://mccue.dev/pages/3-11-19-ferry-problem-in-rust ## Someone else's C homework done in rust ```rust // Ferry Loading // Before bridges were common, ferries were used to transport cars across rivers. // River ferries, unlike their larger cousins, run on a guide line and are powered by the // river's current. Cars drive onto the ferry from one end, the ferry crosses the river, and // the cars exit from the other end of the ferry. // There is an l-meter-long ferry that crosses the river. 
A car may arrive at either // river bank to be transported by the ferry to the opposite bank. The ferry travels // continuously back and forth between the banks so long as it is carrying a car or there // is at least one car waiting at either bank. Whenever the ferry arrives at one of the // banks, it unloads its cargo and loads up cars that are waiting to cross as long as they // fit on its deck. The cars are loaded in the order of their arrival and the ferry's deck // accommodates only one lane of cars. The ferry is initially on the left bank where it // had mechanical problems and it took quite some time to fix it. In the meantime, lines // of cars formed on both banks that wait to cross the river. // The first line of input contains c, the number of test cases. Each test case begins // with the number l, a space and then the number m. m lines follow describing the cars // that arrive in this order to be transported. Each line gives the length of a car (in // centimeters), and the bank at which the car awaits the ferry ("left" or "right"). // For each test case, output one line giving the number of times the ferry has to cross // the river in order to serve all waiting cars. 
// Sample input
// 4
// 20 4
// 380 left
// 720 left
// 1340 right
// 1040 left
// 15 4
// 380 left
// 720 left
// 1340 right
// 1040 left
// 15 4
// 380 left
// 720 left
// 1340 left
// 1040 left
// 15 4
// 380 right
// 720 right
// 1340 right
// 1040 right

use std::collections::LinkedList;
use std::error::Error;
use std::io::BufRead;

#[derive(Debug, PartialEq, Eq)]
enum RiverBank {
    Left,
    Right,
}

impl RiverBank {
    fn try_from(value: &str) -> Result<RiverBank, Box<Error>> {
        match value {
            "left" => Ok(RiverBank::Left),
            "right" => Ok(RiverBank::Right),
            _ => Err("A river bank must either be \"left\" or \"right\"".into()),
        }
    }

    fn switch(&self) -> RiverBank {
        match self {
            &RiverBank::Left => RiverBank::Right,
            &RiverBank::Right => RiverBank::Left,
        }
    }
}

type CarLength = u64;
type FerryLength = u64;

#[derive(Debug)]
struct StartingCarState {
    river_bank: RiverBank,
    car_length: CarLength,
}

#[derive(Debug)]
struct FerryProblem {
    ferry_length: FerryLength,
    car_descriptions: Vec<StartingCarState>,
}

fn read_input(from: impl BufRead) -> Result<Vec<FerryProblem>, Box<Error>> {
    let mut lines = from.lines();
    let first_line = match lines.next() {
        Some(Ok(line)) => line,
        Some(Err(err)) => {
            return Err(err.into());
        }
        None => return Err("There was no first line provided to stdin".into()),
    };
    let number_of_test_cases: u64 = first_line
        .parse()
        .map_err(|_| "The first line needs to be a single non-negative number.")?;
    let mut ferry_problems = Vec::new();
    for _ in 0..number_of_test_cases {
        let ferry_description_line = match lines.next() {
            Some(Ok(line)) => line,
            Some(Err(err)) => {
                return Err(err.into());
            }
            None => {
                return Err("Ran out of input when parsing test cases".into());
            }
        };
        let (ferry_length, number_of_cars) = {
            // `str::split` takes a literal pattern, not a regex, so use `split_whitespace`.
            let split_by_whitespace: Vec<&str> = ferry_description_line.split_whitespace().collect();
            if split_by_whitespace.len() == 2 {
                let mut ferry_length = split_by_whitespace[0].parse()?;
                ferry_length *= 100;
                let number_of_cars: u64 = split_by_whitespace[1].parse()?;
                (ferry_length, number_of_cars)
            } else {
                return Err("Malformed ferry description line".into());
            }
        };
        let mut car_descriptions = Vec::new();
        for _ in 0..number_of_cars {
            let car_description_line = match lines.next() {
                Some(Ok(line)) => line,
                Some(Err(err)) => {
                    return Err(err.into());
                }
                None => return Err("Not enough descriptions of cars given".into()),
            };
            let car_state = {
                // `str::split` takes a literal pattern, not a regex, so use `split_whitespace`.
                let split_by_whitespace: Vec<&str> = car_description_line.split_whitespace().collect();
                if split_by_whitespace.len() == 2 {
                    let car_length = split_by_whitespace[0].parse()?;
                    let river_bank = RiverBank::try_from(split_by_whitespace[1])?;
                    StartingCarState {
                        river_bank: river_bank,
                        car_length: car_length,
                    }
                } else {
                    return Err("Malformed car description line".into());
                }
            };
            car_descriptions.push(car_state);
        }
        ferry_problems.push(FerryProblem {
            ferry_length,
            car_descriptions,
        })
    }
    Ok(ferry_problems)
}

#[derive(Debug)]
enum FerryProblemSolution {
    RequiredTrips(u64),
    Impossible,
}

fn required_crossings(problem: FerryProblem) -> FerryProblemSolution {
    let mut left_bank: LinkedList<CarLength> = problem
        .car_descriptions
        .iter()
        .filter(|car| car.river_bank == RiverBank::Left)
        .map(|car| car.car_length)
        .collect();
    let mut right_bank: LinkedList<CarLength> = problem
        .car_descriptions
        .iter()
        .filter(|car| car.river_bank == RiverBank::Right)
        .map(|car| car.car_length)
        .collect();
    let mut bank = RiverBank::Left;
    let mut used_capacity = 0;
    let mut trips_made = 0;
    loop {
        let next_car = match bank {
            RiverBank::Left => left_bank.pop_front(),
            RiverBank::Right => right_bank.pop_front(),
        };
        match next_car {
            Some(car) => {
                if car > problem.ferry_length {
                    return FerryProblemSolution::Impossible;
                } else if car + used_capacity > problem.ferry_length {
                    trips_made += 1;
                    used_capacity = 0;
                    match bank {
                        RiverBank::Left => left_bank.push_front(car),
                        RiverBank::Right => right_bank.push_front(car),
                    };
                    bank = bank.switch()
                } else {
                    used_capacity += car;
                }
            }
            None => {
                trips_made += 1;
                let other_bank_is_empty = match bank {
RiverBank::Left => right_bank.is_empty(), RiverBank::Right => left_bank.is_empty(), }; if other_bank_is_empty { return FerryProblemSolution::RequiredTrips(trips_made); } else { used_capacity = 0; bank = bank.switch() } } } } } fn main() -> Result<(), Box<Error>> { let test_input = "4 20 4 380 left 720 left 1340 right 1040 left 15 4 380 left 720 left 1340 right 1040 left 15 4 380 left 720 left 1340 left 1040 left 15 4 380 right 720 right 1340 right 1040 right "; /* replace with io::stdin().lock() for real input */ let problems = read_input(test_input.as_bytes())?; for problem in problems { let solution = required_crossings(problem); match solution { FerryProblemSolution::Impossible => println!("There is no solution"), FerryProblemSolution::RequiredTrips(trips) => println!("{}", trips), } } Ok(()) } ``` Mon, 11 Mar 0019 05:00:00 +0000Is there a better way of storing a JSON entryhttps://mccue.dev/pages/3-10-19-is-there-a-better-way-of-storing-a-json-entry ## Question from Anonymous <!-- Zombie_Pigdragon#3468 --> > ```python > class Event: > def __init__(self, name, time, location): > self.name = name > self.time = time > self.location = location > ``` > Is this a good thing to do > or should I not > (Deleted User) I'd suggest a namedtuple instead. it's much faster and efficient. > > Or dataclasses if you're using py 3.7. > How do I use a namedtuple? Actually quick disclaimer. The way python's JSON module works, it's impossible for you to dump `namedtuple`s as anything other than a list. Here you should just use a dictionary (IMO). ```python event = {"name": name, "time": time, "location": location} ``` But if I'm being honest that's just my clojure brain talking. pro: If you deserialize from json you get this anyways and you can do round-trip serde and have the same representation. con: The "shape" of the dict doesn't have a name and you can't use the `.` syntax to access stuff by default. 
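To make the `namedtuple` caveat concrete, here is a small sketch using only the standard library. The `Event` fields mirror the class in the question; the sample values are made up for illustration.

```python
import json
from collections import namedtuple

# Same three fields as the Event class in the question; the values are invented.
Event = namedtuple("Event", ["name", "time", "location"])
event = Event("Standup", "09:00", "Room 2")

# A namedtuple is still a tuple, so json writes it as an array
# and the field names are lost.
as_list = json.dumps(event)

# A plain dict keeps its keys and comes back unchanged after a round trip.
event_dict = {"name": "Standup", "time": "09:00", "location": "Room 2"}
round_tripped = json.loads(json.dumps(event_dict))

# If you already have a namedtuple, _asdict() turns it into a dict
# before it ever reaches json.dumps.
as_object = json.dumps(event._asdict())

print(as_list)
print(as_object)
```

So the dict is what you end up with on the JSON side either way; the namedtuple only helps while the data is inside your program.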
If you want to use the dot-notation for access with a `dict` just wrap it in this lib. https://github.com/Infinidat/munch In general, you don't gain much by "dressing up" your data though named tuples are immutable, which is nice. Sun, 10 Mar 0019 05:00:00 +0000How to make a custom python templating enginehttps://mccue.dev/pages/3-9-19-python-for-html-templates ## Question from Anonymous <!-- Zombie_Pigdragon#3468 --> > I have now used a language I am barely familiar with (Python) (I don't like the scopes) to parse a webpage I built off of online tutorials and it's shitty embeds of python so I could generate a new page that has the current information from Google Calendar and presentation slides from Dropbox, neither of which are APIs I had experience with. I have the shittiest code ever written > > ```html > <div id="SlideshowContainer"> > ${"\n\t\t\t".join(getImageFiles(imageList))} > </div> > ``` > > a small sample of the HTML part. > > But I don't like Python, it's just what I'm required to use > > I have the code execution working, it's basically my own implementation of templates > > Python scopes trip me up for no reason, as well as types > Holy wars of editors don't ever change my opinion; I just use Visual Studio because I normally use C#, but I switched to IDLE for this one, because I was hoping it would be able to work better for Python (it doesn't) What exactly do you mean by scopes though? like, declarations of things? ```python if thing: x = 2 else: x = 3 # x exists here ``` > Declarations in particular, but then I always forget the global keyword in particularly Wait, that's a codesmell. In 5 years of python ive never needed to use the global keyword more than 1 time and even that was a mistake. I've had to use nonlocal for one assignment - but this is indicitive probably of a larger problem in how you write code. > No, Python just feels like shitty scripting IMO and I treat Python like I treat a one shot bash/batch script. 
(my main interest here is just to get to the root of why you feel this way) (not to knock the case of one off hacky scripts - it's pretty good for that) > But I only ever need Python for either scripting or small programs > > It's only in use here because it's the one language I can remotely use that works on Linux What are the other languages you can use? That might give me some context on what angle you are coming from here. > C# mostly okay, ill loop back around after I have run your code, but I have a hypothesis here. Also, if you have any of your C# code that you can share I am curious how you write code with the guard rails of static typing. > I have code that I think kinda doesn't suck, but it does in retrospect Good enough. So anyways, first I'm walking through your script. First issue ```python data = [] for i in range(0, 10): data.append(("&nbsp", "&nbsp", "&nbsp")) ``` "data" tells me nothing about what this is. Why do you have a list of 10 tuples of 3 non-breaking space html escapes? They also aren't the whole thing you need since you should have a semicolon at the end if I remember correctly `&nbsp;` ```python command = "value = " + match.groups(1)[0] print(command) exec(command, {}, enviormentVariablesInFile) ``` I think you see this coming, but it is worth pointing out anyways. `exec` is never what you want. Part of what is going wrong here is that, beyond using exec as a shorthand for a calculator basically, is that you are assigning to a variable. If anything, at least put the value assignment outside of the command you want to run. 
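As a rough sketch of that last point: `eval` takes a single expression and returns its value, so the assignment can live in your own code instead of inside the string being executed. The `environment` dicts and the `len(imageList)` expression below are hypothetical stand-ins for what the script passes in, and this still evaluates arbitrary code, so treat it as the smaller of two evils rather than a recommendation.

```python
# Hypothetical stand-in for whatever sits between ${ and } in the template.
expression = "len(imageList)"

# exec runs statements and returns None, so the result has to be
# smuggled out through a variable in the environment dict.
exec_environment = {"imageList": ["a.png", "b.png"], "value": None}
exec("value = " + expression, {}, exec_environment)
via_exec = exec_environment["value"]

# eval evaluates one expression and hands the result straight back,
# so the environment only needs to hold the inputs.
eval_environment = {"imageList": ["a.png", "b.png"]}
via_eval = eval(expression, {}, eval_environment)

print(via_exec, via_eval)
```

Both calls compute the same thing; the `eval` version just doesn't need a magic `value` slot in the environment.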
But let's look a bit closer at what you are using it for

```python
pattern = re.compile("\\${(.*?)}")
enviormentVariablesInFile = {"data": data, "imageList": imageList, "value": None}
for match in reversed(list(pattern.finditer(output))):
    span = match.span()
    command = "value = " + match.groups(1)[0]
    print(command)
    exec(command, {}, enviormentVariablesInFile)
    value = enviormentVariablesInFile["value"]
    if value is None:
        value = ""
    i += 1
    output = output[:span[0]] + value + output[span[1]:]
```

The first issue here is `pattern`. Regular expressions are not readable. You need to pick a name for that thing that describes what it does. I can't reverse engineer it, so let's pretend it finds unicorns.

```python
unicornPattern = re.compile("\\${(.*?)}")
unicornsInOutput = list(unicornPattern.finditer(output))
enviormentVariablesInFile = {"data": data, "imageList": imageList, "value": None}
for unicornMatch in reversed(unicornsInOutput):
    span = unicornMatch.span()
    command = "value = " + unicornMatch.groups(1)[0]
    print(command)
    exec(command, {}, enviormentVariablesInFile)
    value = enviormentVariablesInFile["value"]
    if value is None:
        value = ""
    i += 1
    output = output[:span[0]] + value + output[span[1]:]
```

Now at least there is a name to the thing.

Also, I guess I get the idea. Your output starts as the html file and then you raw exec code in there that you delimited with `${}` and insert it as you go. Because you rebuild the whole document for every match you end up with `n^2` behaviour, but that is fine for your project.

But - there has to be a better way (and there is, even sans libraries)

If you think of python as only good for you to write in a scripting way you should really be asking yourself: "how exactly would I do things differently in C#?". In this case the main thing C# is going to prevent you from doing is evaluating arbitrary code, since "eval"-ing C# is far less easy to do.
You even use this eval behaviour to define helper functions within your template ```html ${None; getImageFiles = lambda l: list(map(lambda i: '<img class="SlideshowImage fade" src="' + str(i) + '" />', l))} ``` The key thing here that sucks - which is made more sucky but for a good reason by python requiring whitespace in syntax - is embedding code in templates. Tools like JSP, java server pages, allow for basically arbitrary access to the context of the code around them. This is why, even if you manage things well at the start, projects using things that are that permissive tend to get off the rails. Down a step are things like Jinja (the templating engine flask uses). They allow for you to embed logic - with the conceit that it is sometimes required or helpful for formatting some html or similar , but they do not allow you arbitrary access to the outside scope. This is what I think you are trying to accomplish with your code since you specify exactly the environment for exec and want to then write code to generate stuff using that environment. This works except for the facts that 1. Python is probably too powerful a language to be embedded in a template and 2. Python is a bad fit syntactically for being embedded in html The most simple kind of templating, and what I suggest you use for your project instead of what you are doing, is find and replace. So instead of having your logic be in your template, you compute what you want to put outside of that context and jam it in after the fact without any logic. So for your html that you want to generate - first things first - lets sub out the variable bits. 
```html <html> <head> <link href="index.css" rel="stylesheet" type="text/css" /> <script src="index.js"></script> </head> <body> <div id="Container"> <div id="ScheduleContainer"> <table id="ScheduleTable"> <tr class="ScheduleRow"> <th id="ScheduleHeader" colspan="5"> <h1>Schedule</h1> </th> </tr> <tr class="ScheduleRow"> <th colspan="3">Name</th> <th>Time</th> <th>Room</th> </tr> <!-- for scheadule in all_scheadules: make a table row for that scheadule. --> </table> </div> <div id="SlideshowContainer"> <!-- for image in imageList: make an image on the page for that image --> </div> <div id="LogoBox"> <img id="LogoImage" src="logo.png" /> </div> <div id="Footer"> <p style="display: inline" id="DateTimeTime" /> <p style="display: inline" id="DateTimeDate" /> </div> </div> </body> </html> ``` That's all you want to do. Now the question is "how do I fill in the comment blocks without barfing eval-able python code. Your first option is the Jinja approach where your templating language has support for basic looping constructs. You can give it the info and it will format that on the html page. But you are rolling your own, so we will go with the second approach - find and replace. 
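Before the real page, here is find and replace in miniature. This sketch uses `string.Template` from the standard library, which is an assumption on my part (the walkthrough that follows uses `str.format` instead). It happens to use the same `${...}` delimiter as your template, but it only ever looks names up and never runs code. The tiny page and its values are made up.

```python
from string import Template

# A made-up, cut-down page with two named holes in it.
page = Template("<table>${schedules}</table><div>${images}</div>")

# The values are computed in ordinary Python first, then dropped in as plain strings.
html = page.substitute(
    schedules="<tr><td>Robotics</td></tr>",
    images='<img src="a.png" />',
)

# Anything that is not a plain name is never evaluated;
# safe_substitute just leaves it where it was.
untouched = Template("${1 + 1}").safe_substitute()

print(html)
print(untouched)
```

The design point is the same whichever string tool you pick: the logic lives in Python functions, and the template is only a string with named gaps.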
First, let's handle the rows

```python
def schedule_row_html(row):
    return """
    <tr class="ScheduleRow">
        <td colspan="3" class="ScheduleName">{name}</td>
        <td class="ScheduleTime">{time}</td>
        <td class="ScheduleRoomNumber">{room_number}</td>
    </tr>""".format(name=row["name"], time=row["time"], room_number=row["room_number"])
```

```python
all_schedule_html = "".join([
    schedule_row_html(row)
    for row in schedules
])
```

Now we are in a position where we can fill in the rows

```python
pageHtml = """
<html>
  <head>
    <link href="index.css" rel="stylesheet" type="text/css" />
    <script src="index.js"></script>
  </head>
  <body>
    <div id="Container">
      <div id="ScheduleContainer">
        <table id="ScheduleTable">
          <tr class="ScheduleRow">
            <th id="ScheduleHeader" colspan="5">
              <h1>Schedule</h1>
            </th>
          </tr>
          <tr class="ScheduleRow">
            <th colspan="3">Name</th>
            <th>Time</th>
            <th>Room</th>
          </tr>
          {schedules}
        </table>
      </div>
      <div id="SlideshowContainer">
        {images}
      </div>
      <div id="LogoBox">
        <img id="LogoImage" src="logo.png" />
      </div>
      <div id="Footer">
        <p style="display: inline" id="DateTimeTime" />
        <p style="display: inline" id="DateTimeDate" />
      </div>
    </div>
  </body>
</html>
"""

# format returns a new string, so keep the result.
# images is still TBD here; it comes from images_html below.
page = pageHtml.format(schedules=all_schedule_html, images=TBD)
```

```python
def images_html(images):
    return "".join([
        "<img class=\"SlideshowImage fade\" src=\"{src}\" />".format(src=imageUrl)
        for imageUrl in images
    ])
```

(This is all pseudocode, so the finer points are up to you)

Now you may be asking yourself

"but what if my template becomes more complicated?"

"just doing string formatting can't scale!"

And to that I say - yeah no duh. That's why people spent time writing, improving, and bug-fixing the existing templating libraries. But if your requirements are as simple as you say - a single page regenerated every day or whatever - just do it inline with strings, who cares.

Also, tiny thing.

```
ChangeImage
```

The C#/.NET naming convention of every first letter being capitalized isn't used anywhere else.
Most javaish people use the `camelCase` thing. Python supports that too, but the generally preferred style is `snake_case`. Doesn't matter for this, but just keep it in the back of your head so when you finally have to code with other programmers you don't get bogged down in pointless holy wars moving on from the exec thing finally: ```python i = 0 for event in events: start = event['start'].get('dateTime') if(start is None): continue start = start[11:16] hour = int(start[0:2]) suffix = " AM" if hour > 12: suffix = " PM" hour %= 12 start = str(hour) + start[2:5] + suffix name = event['summary'] location = event['location'] print(name, start, location, sep=", ") data[i] = (name, start, location) i += 1 ``` What is this `i`? It seems like you are just counting in step with the data because the way you coded it requires a set number of schedules on the page. Hopefully you know how to fix that now and you can just append to data (or whatever name you give it that actually represents what it is). The larger problem with using `i` like this is that it increases the area you need to read over to understand a given chunk of code since you need to track reassignments and changes and uses of `i` everywhere from its first declaration to its last. "`i`", while customary for simple kind of index based for loops from c-ish languages and sometimes when using `range(...)`, really isn't a good enough name here. ```python start = event['start'].get('dateTime') if(start is None): continue start = start[11:16] hour = int(start[0:2]) suffix = " AM" if hour > 12: suffix = " PM" hour %= 12 ``` Also, date handling logic is always going to be messy. I get that. But try and make it depend on less magic numbers. Maybe isolate it to its own function (maybe, depending on if that helps or hurts readability in context) What is `start[11:16]`? Maybe you know now, but god only knows a year from now. `getService` you copy-pasted. 
No problem, but maybe put a link back to where you copy-pasted it from in case you need to change it later.

I would loop back around to tackling your misgivings about python but at this point I'm tired

> Okay, sorry, I had to go halfway through this, and just got back
>
> Thanks for all the help!
>
> I'll try to fix some of the worse problems in this
>
> You're right, I'm not respecting the language correctly

I wouldn't use the verb "Respecting" necessarily. You just need to learn how to write code to be read. I think being in python lowers some guardrails, so you are just bumping into stuff more.

Sat, 09 Mar 0019 05:00:00 +0000Can you explain mallochttps://mccue.dev/pages/3-8-19-explain-malloc

## Question from a Deleted User

> could you explain what's happening here
> ```
> (char *) malloc(n * sizeof(char));
> ```

The malloc thing allocates a block of memory on the heap, here `n * sizeof(char)` bytes, which is enough room for `n` characters.

Think of it like this: anything you declare like `char[]` inside a function without calling malloc is memory that is released as soon as the function returns. Any memory you get by using malloc is never released unless you explicitly call free on it.

I know it's confusing, but you'll figure it out.

> and what does (char *) do here

malloc returns a `void *`, and the `(char *)` bit explicitly casts that pointer to a pointer to `char` in the eyes of the compiler. It is a pointer to the start of memory that contains god knows what.

If you allocated `sizeof(char)` then you would have a pointer to a block of memory that is the size of a single `char`, which would be equivalent to an array of characters of size 1.

> does C have array index oob error? I remember it took me days to find out that I've made a mistake while looping the array

No and yes.

Yes, that can cause your program to segfault or maybe crash in some way.

No, you don't get a named error that tells you what you messed up.

If you want to get some output it might be educational to make your own "data structure" and `printf` or something if an error happens.
If you want C perf and sensible errors try Rust, but my Spidey Senses tell me you are still learning in general so that's not really a productive jump. Fri, 08 Mar 0019 05:00:00 +0000What is the builder pattern forhttps://mccue.dev/pages/3-7-19-what-is-the-builder-pattern-for ## Question from Abdul#4709 > Stupid assignment requirement for using super method inside constructor > > Just wanted to make sure prompt makes sense > ```java > System.out.print("Enter 'y' or 'n' if the triangle is filled: "); > char e = scanner.next().charAt(0); > boolean f = e == 'y' ? true : false; > Triangle triangle = new Triangle(a, b, c, d, f); > ``` > It's a small class so jus using var names like this Once a constructor gets to 5 parameters it is probably time to switch to using the builder pattern. It's boilerplate code in java, but it is a must for readability. Can you share your triangle class as it is right now? > Never seen builder pattern before I will make sure to look it up > > &lt; CODE LOST TO TIME &gt; ```java protected Date dateCreated; ``` Thats...wierd. Why is this info stored? > I actually didn't make the Geometric object class it was posted by the prof > > I made the other Triangle one Man your prof is annoying > His lectures r even worse Well, making do with what you have is...workable, but we can revisit that superclass to see how you would design it if you weren't a bored college professor. Basically, for most uses the builder pattern is just a substitute for a language feature called named optional parameters Consider this python ```python class Position: def __init__(self, *, x=0, y=0, z=0): self.x = x self.y = y self.z = z ``` > What's the second parameter in that 🤔 It is a python shorthand saying "these things need to be named". That's not really the focus though. With the constructor (init method - close enough for now) being written like this you can call it a bunch of different ways. 
```python
Position() # makes a 0, 0, 0
Position(x=1) # makes a 1, 0, 0
Position(z=4) # makes a 0, 0, 4
Position(y=2, z=4) # makes a 0, 2, 4
Position(x=1, y=2, z=4) # makes a 1, 2, 4
```

This has a lot of cool benefits

> Oh so you don't have to define different constructors many times unlike in java

For one, if a field has a sensible default value (like all of these do in the position case) you can just insert that if it isn't specified.

Also, the parameters being named means that you can specify them "out of order" with the method/function definition, which is very important if you have more than 3 parameters.

"What does the 6th int mean" is a stupid question to have to ask yourself, not to mention the chance you get it wrong, so having the parameters named puts the name of the thing right next to the value.

And before I get to explaining the builder pattern (your way of hacking this language feature into java), try and consider how you would support the Position example with just overloading methods.

> Aight
>
> Perhaps having one method for just x and y and another one for x, y, z?

What if I want to specify just y and z?

> That would require you to make another method
>
> Which is repetition
>
> And we just have three fields in a larger class there will be even more

Not only that, it wouldn't work.

Remember, if you have two methods with the same name (or constructors for that matter) then java needs to be able to tell them apart by the types of their arguments. So you can't have two constructors which both take two ints.

```java
// As far as the compiler can tell, these are identical
Position(int x, int y) { ... }
Position(int y, int z) { ... }
```

You would have to start making static factory methods with different names for each case.

```java
.positionYZ(...)
.positionXZ(...)
``` > Ah shite true And, while the default values piece of this is important, there will be times where mandating that the user name out all of the parameters, even if there are no optional parameters, makes it so your code is actually legible in the face of dozens of properties. So without further ado - the builder pattern. First, we will start with your triangle code ```java public class Triangle extends GeometricObject { private double a; // Side one. private double b; // Side two. private double c; // Side three. public Triangle() { this.setA(1.0); this.setB(1.0); this.setC(1.0); } public Triangle(double a, double b, double c, String d, boolean e) { super(d, e); this.setA(a); this.setB(b); this.setC(c); } public double getA() { return a; } public void setA(double a) { this.a = a; } public double getB() { return b; } public void setB(double b) { this.b = b; } public double getC() { return c; } public void setC(double c) { this.c = c; } public double getArea() { double p = (this.a + this.b + this.c) / 2; return Math.sqrt(p * (p - this.a) * (p - this.b) * (p - this.c)); } public double getPerimeter() { return this.a + this.b + this.c; } @Override public String toString() { return "Triangle{" + "a=" + a + ", b=" + b + ", c=" + c + ", area=" + this.getArea() + ", perimeter=" + this.getPerimeter() + ", color='" + color + '\'' + ", filled=" + filled + '}'; } } ``` > (kenndel#7506) Java is a beautiful language > ```java public class Triangle extends GeometricObject { private double a; // Side one. private double b; // Side two. private double c; // Side three. 
    public Triangle() {
        // The setters are gone in this version, so assign the fields directly
        this.a = 1.0;
        this.b = 1.0;
        this.c = 1.0;
    }

    public Triangle(double a, double b, double c, String d, boolean e) {
        super(d, e);
        this.a = a;
        this.b = b;
        this.c = c;
    }

    public double getA() {
        return a;
    }

    public double getB() {
        return b;
    }

    public double getC() {
        return c;
    }

    public double getArea() {
        double p = (this.a + this.b + this.c) / 2;
        return Math.sqrt(p * (p - this.a) * (p - this.b) * (p - this.c));
    }

    public double getPerimeter() {
        return this.a + this.b + this.c;
    }

    @Override
    public String toString() {
        return "Triangle{" +
                "a=" + a +
                ", b=" + b +
                ", c=" + c +
                ", area=" + this.getArea() +
                ", perimeter=" + this.getPerimeter() +
                ", color='" + color + '\'' +
                ", filled=" + filled +
                '}';
    }
}
```

Now let's give things more descriptive names and let's also get rid of that default constructor for now.

```java
public class Triangle extends GeometricObject {
    private double sideA; // Side one.
    private double sideB; // Side two.
    private double sideC; // Side three.

    public Triangle(double sideA, double sideB, double sideC, String color, boolean filled) {
        super(color, filled);
        this.sideA = sideA;
        this.sideB = sideB;
        this.sideC = sideC;
    }

    public double getSideA() {
        return this.sideA;
    }

    public double getSideB() {
        return this.sideB;
    }

    public double getSideC() {
        return this.sideC;
    }

    public double getArea() {
        // Heron's formula: p is half the perimeter, so all three sides are summed
        double p = (this.sideA + this.sideB + this.sideC) / 2;
        return Math.sqrt(p * (p - this.sideA) * (p - this.sideB) * (p - this.sideC));
    }

    public double getPerimeter() {
        return this.sideA + this.sideB + this.sideC;
    }

    @Override
    public String toString() {
        return "Triangle{" +
                "a=" + this.sideA +
                ", b=" + this.sideB +
                ", c=" + this.sideC +
                ", area=" + this.getArea() +
                ", perimeter=" + this.getPerimeter() +
                ", color='" + color + '\'' +
                ", filled=" + filled +
                '}';
    }
}
```

From now on, I am going to leave out all of the methods for space.

The first thing we want to do is make the constructor private, since we are going to be replacing the access pattern of
"calling the constructor with all the arguments" with our builder. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } } ``` > Never seen a private constructor before A private constructor can only be called from the class it is defined and stuff within that class, So lets make the stuff that goes in the class. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } public static class Builder {} } ``` We add a builder class within the triangle class. This builder can access all of the private methods of the Triangle class because it is in the Triangle class. Now this builder needs to keep track of all of the information needed to build a triangle. We also want that information to be added one piece at a time - one method call at a time. > Does it matter if Builder class is static or non static Yep. Thats a confusing java specific thing - feel free to google why that is needed. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } public static class Builder { private double sideA; private double sideB; private double sideC; private String color; private boolean filled; public Builder() { // Put any defaults here. 
            // If there isn't any default, set it to null and check for that later
            this.sideA = 1.0;
            this.sideB = 1.0;
            this.sideC = 1.0;
            this.color = "black";
            this.filled = true;
        }

        public void setSideA(double sideA) {
            this.sideA = sideA;
        }

        public void setSideB(double sideB) {
            this.sideB = sideB;
        }

        public void setSideC(double sideC) {
            this.sideC = sideC;
        }

        public void setColor(String color) {
            this.color = color;
        }

        public void setFilled(boolean filled) {
            this.filled = filled;
        }
    }
}
```

Now you have the ability to mutate the builder however you want.

> Wait, you have to redeclare the fields inside builder class?

Usually, yeah. It is far from a perfect system. It is bad code to write, but it provides the nicest possible outward facing interface.

> So when you're mutating the fields inside builder does it update them in the main class also or just the builder class

Just the builder class. No instance of the outer class exists yet.

```java
public class Triangle extends GeometricObject {
    private double sideA; // Side one.
    private double sideB; // Side two.
    private double sideC; // Side three.

    private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) {
        super(color, filled);
        this.sideA = sideA;
        this.sideB = sideB;
        this.sideC = sideC;
    }

    public static class Builder {
        private double sideA;
        private double sideB;
        private double sideC;
        private String color;
        private boolean filled;

        public Builder() {
            // Put any defaults here.
            // If there isn't any default, set it to null and check for that later
            this.sideA = 1.0;
            this.sideB = 1.0;
            this.sideC = 1.0;
            this.color = "black";
            this.filled = true;
        }

        public void setSideA(double sideA) {
            this.sideA = sideA;
        }

        public void setSideB(double sideB) {
            this.sideB = sideB;
        }

        public void setSideC(double sideC) {
            this.sideC = sideC;
        }

        public void setColor(String color) {
            this.color = color;
        }

        public void setFilled(boolean filled) {
            this.filled = filled;
        }
    }
}
```

No change yet, I just cleaned up the above code.
So now that we can set the properties on the builder we have to add a method for actually constructing the Triangle. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } public static class Builder { private double sideA; private double sideB; private double sideC; private String color; private boolean filled; public Builder() { // Put any defaults here. If there isn't any default set it to null and check for that later this.sideA = 1.0; this.sideB = 1.0; this.sideC = 1.0; this.color = "black"; this.filled = true; } public void setSideA(double sideA) { this.sideA = sideA; } public void setSideB(double sideB) { this.sideB = sideB; } public void setSideC(double sideC) { this.sideC = sideC; } public void setColor(String color) { this.color = color; } public void setFilled(boolean filled) { this.filled = filled; } public Triangle build() { // You still need to remember the order, but it is only in one place at least // This is also the place you should put any validations. Checking if any required properties are not set, things are // set as null that should not be null, etc. return new Triangle(this.sideA, this.sideB, this.sideC, this.color, this.filled); } } } ``` now, as it is written now, you would end up using the builder like this ```java Triangle.Builder builder = new Triangle.Builder(); builder.setSideA(...); builder.setSideB(...); builder.setSideC(...); builder.setColor(...); builder.setFilled(...); Triangle triangle = builder.build(); ``` One way to make it a bit easier to use the builder is to make each setter return a reference to the builder itself. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. 
private double sideC; // Side three. private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } public static class Builder { private double sideA; private double sideB; private double sideC; private String color; private boolean filled; public Builder() { // Put any defaults here. If there isn't any default set it to null and check for that later this.sideA = 1.0; this.sideB = 1.0; this.sideC = 1.0; this.color = "black"; this.filled = true; } public Builder setSideA(double sideA) { this.sideA = sideA; return this; } public Builder setSideB(double sideB) { this.sideB = sideB; return this; } public Builder setSideC(double sideC) { this.sideC = sideC; return this; } public Builder setColor(String color) { this.color = color; return this; } public Builder setFilled(boolean filled) { this.filled = filled; return this; } public Triangle build() { return new Triangle(this.sideA, this.sideB, this.sideC, this.color, this.filled); } } } ``` This lets you "chain" the method calls like this ```java Triangle triangle = new Triangle.Builder() .setSideA(...) .setSideB(...) .setSideC(...) .setColor(...) .setFilled(...) .build(); ``` So that is the basic "pattern". From here on out its kinda all preference and style. Personally, I don't like the `setThing` naming with builders, so I usually choose to just use the field name. In a builder its kinda understood that you are setting things. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. 
private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } public static class Builder { private double sideA; private double sideB; private double sideC; private String color; private boolean filled; public Builder() { // Put any defaults here. If there isn't any default set it to null and check for that later this.sideA = 1.0; this.sideB = 1.0; this.sideC = 1.0; this.color = "black"; this.filled = true; } public Builder sideA(double sideA) { this.sideA = sideA; return this; } public Builder sideB(double sideB) { this.sideB = sideB; return this; } public Builder sideC(double sideC) { this.sideC = sideC; return this; } public Builder color(String color) { this.color = color; return this; } public Builder filled(boolean filled) { this.filled = filled; return this; } public Triangle build() { return new Triangle(this.sideA, this.sideB, this.sideC, this.color, this.filled); } } } ``` ```java Triangle triangle = new Triangle.Builder() .sideA(...) .sideB(...) .sideC(...) .color(...) .filled(...) .build(); ``` The other thing that is useful is to make the actual builder not constructable but instead give out an instance via a static method. ```java public class Triangle extends GeometricObject { private double sideA; // Side one. private double sideB; // Side two. private double sideC; // Side three. 
private Triangle(double sideA, double sideB, double sideC, String color, boolean filled) { super(color, filled); this.sideA = sideA; this.sideB = sideB; this.sideC = sideC; } // You get a new builder via Triangle.builder() public static Builder builder() { return new Builder(); } public static class Builder { private double sideA; private double sideB; private double sideC; private String color; private boolean filled; // And now this is private so your users don't construct it directly private Builder() { this.sideA = 1.0; this.sideB = 1.0; this.sideC = 1.0; this.color = "black"; this.filled = true; } public Builder sideA(double sideA) { this.sideA = sideA; return this; } public Builder sideB(double sideB) { this.sideB = sideB; return this; } public Builder sideC(double sideC) { this.sideC = sideC; return this; } public Builder color(String color) { this.color = color; return this; } public Builder filled(boolean filled) { this.filled = filled; return this; } public Triangle build() { return new Triangle(this.sideA, this.sideB, this.sideC, this.color, this.filled); } } } ``` ```java Triangle triangle = Triangle.builder() .sideA(...) .sideB(...) .sideC(...) .color(...) .filled(...) .build(); ``` I don't have a real rock solid argument for the `.builder()` static method other than it looks better, but I think I might have in the past. 
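Since the builder is mostly standing in for named optional parameters, it can help to see roughly what the same `Triangle` might look like in a language that has them. This is only a sketch in Python, mirroring the earlier `Position` example. The defaults are copied from the builder above, the `GeometricObject` superclass is left out, the snake_case names are my own, and the positive-side check is an assumed stand-in for the kind of validation you would put in `build()`.

```python
import math


class Triangle:
    # The bare * makes every parameter keyword-only, so callers have to name
    # them. That is the same thing the builder's chained methods give you.
    def __init__(self, *, side_a=1.0, side_b=1.0, side_c=1.0, color="black", filled=True):
        # Assumed validation, standing in for checks you would do in build().
        for name, side in (("side_a", side_a), ("side_b", side_b), ("side_c", side_c)):
            if side <= 0:
                raise ValueError(f"{name} must be positive, got {side}")
        self.side_a = side_a
        self.side_b = side_b
        self.side_c = side_c
        self.color = color
        self.filled = filled

    def perimeter(self):
        return self.side_a + self.side_b + self.side_c

    def area(self):
        # Heron's formula, same as getArea() in the Java version.
        p = self.perimeter() / 2
        return math.sqrt(p * (p - self.side_a) * (p - self.side_b) * (p - self.side_c))


triangle = Triangle(side_a=3.0, side_b=4.0, side_c=5.0, color="red")
print(triangle.perimeter())  # 12.0
print(triangle.area())       # 6.0
```

Everything the Java builder needed a nested class, five methods, and a `build()` call for fits in one signature here, which is really the whole argument for why the pattern exists.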
Thu, 07 Mar 0019 05:00:00 +0000

How to print every item in a list
https://mccue.dev/pages/3-7-19-how-to-print-elements-of-list

> How can I get items inside a list to print into the terminal individually in Python
>
> List = ["cat", "dog", "snake"]
>
> I want them to print to the terminal on a separate line like
>
> Cat
>
> Dog
>
> Snake
>
> Oh I googled it
>
> I did google it before but I didn't phrase it correctly
>
> ```python
> for x in list:
>     print(x)
> ```

I will save you some time in the future then for other data structures

```python
# list
x = [1, 2, 3]
for thing in x:
    print(thing)

# dictionary
x = { "a": "apple", "b": "banana" }

## On Python versions before 3.7, don't rely on the order you get these in
## (3.7+ keeps dictionaries in insertion order)
for key in x:
    print(key)  # will give you "a" and "b"

for key in x.keys():
    print(key)  # identical behaviour to the above, but a bit more explicit

for value in x.values():
    print(value)  # Will give you "apple" and "banana"

for key, value in x.items():
    print(key)  # Will give you the pairs ("a", "apple") and ("b", "banana")
    print(value)

"a" in x      # Will evaluate to True
"apple" in x  # Will evaluate to False. "in" with a dictionary checks keys.

# sets
x = { 1, 2, 3 }
for thing in x:
    print(thing)  # Will give you 1, 2, and 3, but don't rely on the order

## A set is the data structure you should use if you want a list of things where it wouldn't matter
## how many times something is in the list and you don't care about the ordering of those things

1 in x  # Will evaluate to True
4 in x  # Will evaluate to False

## Small note, make an empty set by calling set(), not by using empty braces {} - That will create an empty dictionary
```

Thu, 07 Mar 0019 05:00:00 +0000

Bad reasons to avoid NodeJS
https://mccue.dev/pages/3-6-19-you-shouldnt-avoid-node-because-of-speed

## Conversation with _diamondburned_#4507

> you shouldn't ever use nodejs anyway
>
> dependencies
>
> slow performance
>
> shit language was never designed for backend anywa
>
> 3.00000000000000000000000003
>
> shit libraries
>
> bloated
>
> each line is a reason btw
>
> even when not activating, it's still slower than compiled languages

Speed is a nonsensical reason to not use node. JS VMs are pretty fast and the kind of applications that use node are mostly IO bound, which Node's event loop is pretty good at.

> It is for me when it comes to restarting a service for maintenance reason or for complex code

That's not how you should be running servers to be honest. "I won't get downtime because I can restart a server ultra fast" really isn't a sensible plan, even if your server is the fastest thing in the world written in Rust.

Either you architect your devops to allow zero downtime (multiple nodes, cycling restarts, multi DC maybe) or you live with having to do maintenance on/off hours. Startup time of node has no impact on that.

As always basically everything has an exception at Google scale and it's not like there aren't valid potshots to take at Node and JavaScript, but as a platform it really does excel at IO heavy tasks.

> wdym heavy IO tasks?
Well, for example, the majority of the time a webserver spends handling a request is spent in IO. Waiting for a database to respond, sending data to the user, receiving a request. Maybe it's a websocket thing and you receive from one user and broadcast to others. Either way, the time spent handling that almost always dominates the amount of time spent actually running code in the js thread. The model of "okay, here is this IO task - give it to another thread and tell me what the results are when it's done" works out pretty well performance wise. Also, the cold startup for node is only really bad if you compare it to Rust, C, and C++ which isn't really fair. https://medium.com/@nathan.malishev/lambda-cold-starts-language-comparison-%EF%B8%8F-a4f4b5f16a62 C and C++ require pretty huge tradeoffs in terms of developer speed and possibility of errors, so they really aren't targeted at the same users as node and rust - well rust is a lot better but being super statically typed still has tradeoffs with how quick you can change things which means it won't be the right choice for every "webserver" use case. > > okay, here is this IO task - give it to another thread and tell me what the results are when it's done > > so bringing go into the equation, go routine workers Yeah I think go's async model is better too. But go isn't that monumentally faster than node that you can say "forget node it's too slow" because it really isn't. Neither is python for that matter. Though python's async and package management story are a lot worse , it is still the reigning king of glue languages. Some critical path too slow? Rewrite it in Rust, join the meme. Python will keep on chugging. I mean, the GIL stinks, but it's still not "painfully" slow. 
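To make the IO-bound point concrete, here is a rough sketch of the "hand off the IO and get told when it's done" model. It uses Python's `asyncio` rather than Node, purely as an illustration. `fake_io` is a made-up stand-in for a database call or network request, and the sleep durations are arbitrary. Note that `asyncio` runs everything on a single-threaded event loop instead of handing work to another thread, but the effect is the one described above: the waiting overlaps.

```python
import asyncio
import time


async def fake_io(name, seconds):
    # Hypothetical stand-in for waiting on a database or a socket.
    # While this coroutine sleeps, the event loop is free to run the others.
    await asyncio.sleep(seconds)
    return f"{name} done"


async def handle_requests():
    # Start all three "IO tasks" at once and wait for every result.
    return await asyncio.gather(
        fake_io("db query", 0.2),
        fake_io("cache lookup", 0.2),
        fake_io("send response", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(handle_requests())
elapsed = time.perf_counter() - start

print(results)  # ['db query done', 'cache lookup done', 'send response done']
# The three waits overlap, so this should be close to 0.2s rather than 0.6s.
print(f"took {elapsed:.2f} seconds")
```

Almost none of that time is spent running Python code, which is why the raw speed of the language barely shows up in this kind of workload.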
Wed, 06 Mar 0019 05:00:00 +0000

How to display execution times
https://mccue.dev/pages/3-1-19-how-to-display-execution-times

## Question from Abdul#4709

> Stuck on LinkedLists
>
> Anyone knows how to display execution times

Probably out of scope for your assignment, but you can always use something like this

```java
private void timeProcedure(Runnable procedure) {
    long start = System.currentTimeMillis();
    procedure.run();
    long end = System.currentTimeMillis();
    long duration = end - start;
    System.out.println(duration + " milliseconds.");
}
```

and then you can do something like

```java
timeProcedure(() -> linkedList.get(1000000 / 2));
```

You can also add the name to have better debugging

```java
private void timeProcedure(String procedureName, Runnable procedure) {
    long start = System.currentTimeMillis();
    procedure.run();
    long end = System.currentTimeMillis();
    long duration = end - start;
    System.out.println(procedureName + " took " + duration + " milliseconds.");
}
```

```java
timeProcedure("Accessing an element in the middle of a linked list", () -> linkedList.get(100000 / 2));
```

which would output something like

```
Accessing an element in the middle of a linked list took 33 milliseconds.
```

Fri, 01 Mar 0019 05:00:00 +0000

MVC Origin
https://mccue.dev/pages/2-17-19-mvc-origin

So, tiny history lesson. The phrase and acronym "MVC" came out of this: http://heim.ifi.uio.no/~trygver/themes/mvc/mvc-index.html

I don't think that's the original memo, but it was just one person in the late 70s who hypothesised that it would be a good model.

And it turns out that yes, separating your concerns like this does in fact make your code easier to write and maintain generally. But for a web server beyond a small project I think you will find that it all doesn't fit so cleanly into exactly 3 buckets M, V, and C.

The Django project uses MV* and I think that gets to the heart of it.
Yes, there is a part of your code involved with storing and modeling your data and yes, there is a part of your code involved with displaying a view but beyond that it's... fuzzy.

Sun, 17 Feb 0019 05:00:00 +0000

Is it okay to have a class with just methods
https://mccue.dev/pages/2-16-19-is-it-okay-to-have-a-class-with-just-methods

## Question from davidv7#2315

> would it be ok to just have a class with methods, no class parameters?
>
> aka
>
> ```java
> class Thing{
>     public Thing(){
>
>     }
>     method1(args..)
>     method2(args...)
> }
> ```
> in java

So specifically if you want to do that you have a few considerations.

Java kinda obscures this, but first consider "Does this method perform a side effect" meaning, does it read from a db, write to a db, print something out, load a file, etc.

If the answer is NO then you can make a "Utils" class

```java
class Utils {
    private Utils() {} // We don't need to construct this class so just disallow it

    // This is a pure function so it is perfectly fine to put it into a helper method like this
    public static int absoluteValuePlusOne(int num) {
        return Math.abs(num) + 1;
    }
}
```

If the answer is YES then there is some value in a class with no parameters. Namely, that class can implement an interface.

Let's say this was your behaviour

```java
MyDataSource dataSource = new MyDataSource();
List<OrderItem> items = dataSource.getItems();
```

If you wanted to make that behaviour customizable you can put that in an interface. (IF being a word I want you to notice: When you "abstract" you make code a little harder to understand so always consider if you actually want that)

```java
public interface DataSource {
    // Does a thing
    List<OrderItem> getItems();
}
```

And use a class with an empty constructor to implement it

```java
public class MyDataSource implements DataSource {
    public MyDataSource() {}

    @Override
    public List<OrderItem> getItems() {
        // ... Some implementation goes here ...
    }
}
```

Which provides you the benefit of being able to swap in how that behaviour is done later on or in different places in your code

```java
DataSource dataSource = new MyDataSource(); // Can be any implementation
List<OrderItem> items = dataSource.getItems();
```

```java
public void someMethodInSomeClass(DataSource dataSource, String userId) {
    // ... can do some hoopla and not have to be changed if your method of retrieving data changes
    // (the example is kinda contrived, I know, but basically every programming example is)
}
```

Using an interface in key places also has implications for how you test your code, but I won't get into that now.

Sat, 16 Feb 0019 05:00:00 +0000

How do you make a fixed size circular buffer
https://mccue.dev/pages/1-28-19-circular-buffer

## Question from JohnDoe#9991

> How would you have an object with 2 values that'd work like this
>
> &gt;(0,1)
>
> &gt;add(2)
>
> (1,2)
>
> &gt;add(3)
>
> (2,3)
>
> &gt;add(10)
>
> (3,10)
>
> &gt;add(50)
>
> (10, 50)
>
> etc
>
> ie replacing the oldest value
>
> An ordered array with unlimited size and removing the oldest value when something is added would do but that'd be ugly

In general you can do that with a "circular buffer"

Here you go.
```python
class CircularBuffer:
    def __init__(self, size=2):
        self._buffer = []
        self._size = size
        self._index = 0

    def add(self, item):
        # This follows your "add"
        if len(self._buffer) < self._size:
            self._buffer.append(item)
        else:
            self._buffer[self._index] = item
        self._index = (self._index + 1) % self._size

    def __len__(self):
        return len(self._buffer)

    def __iter__(self):
        for item in self._buffer:
            yield item

    def __getitem__(self, key):
        return self._buffer[key]

    def __repr__(self):
        return f"CircularBuffer (buffer={self._buffer}, size={self._size}, index={self._index})"


c = CircularBuffer(size=2)
print(c)
c.add(0)
print(c)
c.add(1)
print(c)
c.add(2)
print(c)
c.add(3)
print(c)
c.add(10)
print(c)
c.add(50)
print(c)
print(c[0])
print(c[1])
print(len(c))
print(list(c))
```

```
CircularBuffer (buffer=[], size=2, index=0)
CircularBuffer (buffer=[0], size=2, index=1)
CircularBuffer (buffer=[0, 1], size=2, index=0)
CircularBuffer (buffer=[2, 1], size=2, index=1)
CircularBuffer (buffer=[2, 3], size=2, index=0)
CircularBuffer (buffer=[10, 3], size=2, index=1)
CircularBuffer (buffer=[10, 50], size=2, index=0)
10
50
2
[10, 50]
```

This is a mutable implementation. If you want an immutable implementation I can write that up as well, but it will take a bit more time.

The key is the modulo operator. We always store a new element to the right of the last one we stored, meaning if I last inserted something at index 0, the next thing I insert goes at index 1. If I reach the "end" of the list or array then I need to "circle" around back to the first element at index 0.

To keep track of this information we remember an integer for the index to write to next (`_index` in the code above), then each time we add something to the list we increment the integer to "move to the right" and to make it "loop around" we use the modulo operator.

"modulo" just means the remainder you would get when dividing two numbers.
For example `15 % 4` is `3`

This is because we can fit three `4`s into `15` (`4 * 3 = 12`), but we don't have enough to fit another four so we have a "remainder" of `3` (`15 - 12 = 3`)

When you repeatedly do `n = (n + 1) % some_limit` then the value of `n` will go up to the limit minus `1` before looping back around to zero.

Try these on paper to get a better idea of why

```
0 % 3 = ?
1 % 3 = ?
2 % 3 = ?
3 % 3 = ?
4 % 3 = ?
5 % 3 = ?
6 % 3 = ?
7 % 3 = ?
```

For your specific case of just 2 items the modulo stuff is kinda overkill (from a cognitive load standpoint) so just do whatever solves your problem the simplest. But if you wanted an object that holds max n elements that works like you described (where old elements are replaced in place) this is probably the cleanest way to do that.

If the "where" in the list doesn't matter then maybe consider using a deque with a size cap.

Mon, 28 Jan 0019 05:00:00 +0000
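For the circular buffer question above, the "deque with a size cap" suggestion can be sketched with the standard library's `collections.deque` and its `maxlen` argument. This is a minimal illustration using the values from the question; the variable name `pair` is just for the example. Unlike the `CircularBuffer` class, the items always come out oldest first, so the "where" of each element shifts on every add.

```python
from collections import deque

# maxlen caps the size; appending to a full deque drops the oldest item.
pair = deque([0, 1], maxlen=2)
print(tuple(pair))  # (0, 1)

for value in (2, 3, 10, 50):
    pair.append(value)
    print(tuple(pair))
# (1, 2)
# (2, 3)
# (3, 10)
# (10, 50)
```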