You are here

Planet Ubuntu

Subscribe to Feed Planet Ubuntu
Planet Ubuntu - http://planet.ubuntu.com/
Përditësimi: 5 ditë 1 orë më parë

Corey Bryant: OpenStack 2023.1 Antelope for Ubuntu 22.04 LTS

Mër, 29/03/2023 - 9:30md

The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack 2023.1 Antelope on Ubuntu 22.04 LTS (Jammy Jellyfish).

Details of the Antelope release

Ubuntu 22.04 LTS

The Ubuntu Cloud Archive for OpenStack 2023.1 Antelope can be enabled on Ubuntu 22.04 by running the following command:
sudo add-apt-repository cloud-archive:antelope

The Ubuntu Cloud Archive for 2023.1 Antelope includes updates for:

aodh, barbican, ceilometer, cinder, designate, designate-dashboard, dpdk (22.11.1), glance, gnocchi, heat, heat-dashboard, horizon, ironic, ironic-ui, keystone, magnum, magnum-ui, manila, manila-ui, masakari, mistral, murano, murano-dashboard, networking-arista, networking-bagpipe, networking-baremetal, networking-bgpvpn, networking-hyperv, networking-l2gw, networking-mlnx, networking-odl, networking-sfc, neutron, neutron-dynamic-routing, neutron-fwaas, neutron-taas, neutron-vpnaas, nova, octavia, octavia-dashboard, openstack-trove, openvswitch (3.1.0), ovn (23.03.0), ovn-octavia-provider, placement, sahara, sahara-dashboard, senlin, swift, trove-dashboard, vitrage, watcher, watcher-dashboard, zaqar, and zaqar-ui.

For a full list of packages and versions, please refer to the Antelope version report.

Reporting bugs

If you have any issues please report bugs using the ‘ubuntu-bug’ tool to
ensure that bugs get logged in the right place in Launchpad:

sudo ubuntu-bug nova-conductor

Thank you to everyone who contributed to OpenStack 2023.1 Antelope!

Corey
(on behalf of the Ubuntu OpenStack Engineering team)

Bryan Quigley: Email list to Ghost site - PastaDollar.com

Hën, 27/03/2023 - 1:50pd

My brother, Nick, just launched a new website/newsletter using Ghost. Ghost is pretty unique in this space, it is:

  • Open Source
  • Backed by a Non-Profit
  • A web publishing platform
  • A newsletter publishing platform
  • A CMS for static sites
  • A member subscription / Patreon replacement

Here is Nick's experience with Ghost so far...

Pricing and Publishers

From a pricing perspective, the platform scales with your audience size. Importantly, Ghost does not take a cut of your membership subscription revenue. For this reason, the pricing incentives mean your Ghost site is more cost-effective when you have 100% paid members and 0% free members. I'm sure that ratio is a pipe dream for most publications, but that's how the business model works. Pricing is currently $9/mo for any site with 500 or fewer subscribers, including web hosting. For context, the new blog that I started in 2023 using Ghost is www.pastadollar.com. The site was originally an email newsletter from my generation of 14 cousins investing together. If you're looking for finance tips and travel hacks, Subscribe here! An email-list-to-ghost-site is a common path for Ghost users. I'm just starting, but major publications like The Atlantic, The Lever, and The Browser use Ghost.

Positives

Ghost has only a handful of themes, but on the whole, they are super clean and approachable. Picking one is as much about your content as it is about which one you list best. All of the themes are responsive and have all the standard customization options like colors and logos. Notably, there is no simple way to use custom fonts. Instead, you can choose between a serif font and a sans-serif font. There is a workaround, but the need for a long list of fonts is a relatively notable omission from a CMS or theme these days. The Ghost editor has a learning curve, but on the whole, it is excellent. I'm used to WordPress's editors, which are total crap by comparison. There are several Ghost editor keyboard shortcuts worth looking up, such as Shift+Enter to jump to the following line but to remain in the same editor block. Hitting Enter will jump to a new block, skipping more lines. The editor auto-saves your work as you go, just as you would expect any web-based tool to do these days. The saving experience is much like google docs - automatic and out of your way. However, sometimes when I haven't changed anything, a modal pops up asking me if I'm certain I want to navigate away without saving—maybe just a bug, but a bit annoying. Many tasks that take multiple steps in a WordPress site are automatic in Ghost. For example, when you add an image to a post or page, Ghost will optimize it. There is no more need to use a plugin or a site like tinypng.com to optimize the image in advance because the platform doesn't do it for you. Ghost was designed with plenty of best practices. For example, the default post URL is the post's title, without any year or dates included. Including dates is possible on other platforms but can quickly cause 404s and redirection issues if/when you update a post. The Ghost developers took the best practice, implemented it as default, and left it to you to simply create the content. It's great.

Negatives Plugins

Ghost has a great library of plugins, which they accurately call 'integrations.' The catch is that most of these are integrations with other major companies' SaaS products, meaning many of them are paid solutions. The company Ghost seems to rely on the most to expand its functionality is Zapier. Half of the help and support articles I've read on Ghost seem to using Zapier to solve a problem. I don't think it's a stretch to say that Ghost relies heavily on Zapier, a freemium SaaS product. So basically, once you set it up and grow your site, it will eat into your bottom line. That isn't necessarily bad, provided it's worth it. But it seems like the solutions to expand your Ghost site are mostly paid solutions rather than something custom via code injection, which Ghost also supports. To be fair, I need to explore code injection solutions more than I have as of this writing.

Quality of Life Issues

URLs are a big part of any site, so I'm amazed this is an issue: When creating a hyperlink, there isn't proper data validation on the link. If you omit the 'https://www' from a new hyperlink, the Ghost editor saves it, but the URL will not actually work. Saving a link like "pastadollar.com" will show as a broken link and not work when you publish the post. This behavior is incredibly annoying because when you're in your site admin, many URLs around the dashboard omit the https://. To ensure your internal URLs will work, I've solved this by always copying them from the live site. Bullets and Numbered lists aren't an easy option I've found when using the editor. I finally found them hidden in an editor block called markdown editor card. The markdown block supports some rich text features that literally anyone who uses a computer is accustomed to. It isn't exactly user-friendly to hide them in a block, but I'm sure there was a good reason for doing so. Another simple workaround I used when I started was to simply copy and paste a bulleted or numbered list from a proper text editor. I'm guessing many writers create whole drafts outside the Ghost editor anyway.

Alternatives

Alternativeto.net lists more than 250+ alternatives to Ghost, but that long list includes broader solutions like WordPress. Like probably anyone who has developed a website in their lives, I've used WordPress (and continue to use it to this day). WordPress, by default, needs a lot of help from its vast plugin library to set it up, similar to how Ghost works out of the box. I don't consider WordPress a proper alternative if your goal is primarily an email newsletter and membership site. Ghost is absolutely worth the cost over WordPress in that regard - it's not even close.

Beehiiv

In my opinion, the most similar and direct competitor to Ghost is a platform called Beehiiv. Beehiiv has less favorable pricing, and the themes and designs are uglier, but it boasts a stronger feature set out of the box. For example, two big ones are subscriber referral rewards for sharing with friends and more customizable email templates. To be clear, you can probably add these features to a Ghost site; it just takes some work. Both Ghost and Beehiiv are designed to scale quickly with your audience size.

Conclusion

Ghost support has been both highly responsive and extremely helpful. For example, I reached out about getting custom domain emails, such as support@mydomain.com, and their guidance was perfect. Their < 24-hour replies led me to many paid and free services and cautioned me on a few 'gotchas' others ran into. These were not canned responses, and I feel like I can go back to them for anything, and we will be able to find a solution. Overall, I love Ghost. The admin interface is spotless and fast. I'm coming primarily from WordPress, and the speed difference is a game-changer. I would work from many tabs simultaneously in WordPress because switching between different interface pages took too many seconds to load. The Ghost admin pages usually take less than a second to load.

Comments

Add a comment via Gitlab. And yes, Ghost has much nicer integrated commentng built-in.

Balint Reczey: Building the Linux kernel in under 10 seconds with Firebuild

Mar, 21/03/2023 - 9:54pd

Russell published an interesting post about his first experience with Firebuild accelerating refpolicy‘s and the Linux kernel‘s build. It turned out a few small tweaks could accelerate the builds even more, crossing the 10 second barrier with Linux’s build.

Build performance with 18 cores

The Linux kernel’s build time is a widely used benchmark for compilers, making it a prime candidate to test a build accelerator as well. In the first run on Russell’s 18 core test system the observed user+sys CPU time was cut by 44% with an actual increase in wall clock time which was quite unusual. Firebuild performed much better than that in prior tests. To replicate the results I’ve set up a clean Debian Bookworm VM on my machine:

lxc launch images:debian/bookworm –vm -c limits.cpu=18 -c limits.memory=16GB bookworm-vm

Compiling Linux 6.1.10 in this clean Debian VM showed build times closer to what I expected to see, ~72% less wall clock time and ~97% less user+sys CPU time:

$ make defconfig && time make bzImage -j18 real 1m31.157s user 20m54.256s sys 2m25.986s $ make defconfig && time firebuild make bzImage -j18 # first run: real 2m3.948s user 21m28.845s sys 4m16.526s # second run real 0m25.783s user 0m56.618s sys 0m21.622s

There are multiple differences between Russell’s and my test system including having different CPUs (E5-2696v3 vs. virtualized Ryzen 5900X) and different file systems (BTRFS RAID-1 vs ext4), but I don’t think those could explain the observed mismatch in performance. The difference may be worth further analysis, but let’s get back to squeezing out more performance from Firebuild.

Firebuild was developed on Ubuntu. I was wondering if Firebuild was faster there, but I got only slightly better build times in an identical VM running Ubuntu 22.10 (Kinetic Kudu):

$ make defconfig && time make bzImage -j18 real 1m31.130s user 20m52.930s sys 2m12.294s $ make defconfig && time firebuild make bzImage -j18 # first run: real 2m3.274s user 21m18.810s sys 3m45.351s # second run real 0m25.532s user 0m53.087s sys 0m18.578s

The KVM virtualization certainly introduces an overhead, thus builds must be faster in LXC containers. Indeed, all builds are faster by a few percents:

$ lxc launch ubuntu:kinetic kinetic-container ... $ make defconfig && time make bzImage -j18 real 1m27.462s user 20m25.190s sys 2m13.014s $ make defconfig && time firebuild make bzImage -j18 # first run: real 1m53.253s user 21m42.730s sys 3m41.067s # second run real 0m24.702s user 0m49.120s sys 0m16.840s # Cache size: 1.85 GB

Apparently this ~72% reduction in wall clock time is what one should expect by simply prefixing the build command with firebuild on a similar configuration, but we should not stop here. Firebuild does not accelerate quicker commands by default to save cache space. This howto suggests letting firebuild accelerate all commands including even "sh” by passing "-o 'processes.skip_cache = []'” to firebuild.

Accelerating all commands in this build’s case increases cache size by only 9%, and increases the wall clock time saving to 91%, not only making the build more than 10X faster, but finishing it in less than 8 seconds, which may be a new world record!:

$ make defconfig && time firebuild -o 'processes.skip_cache = []' make bzImage -j18 # first run: real 1m54.937s user 21m35.988s sys 3m42.335s # second run real 0m7.861s user 0m15.332s sys 0m7.707s # Cache size: 2.02 GB

There are even faster CPUs on the market than this 5900X. If you happen to have access to one please leave a comment if you could go below 5 seconds!

Scaling to higher core counts and comparison with ccache

Russell raised the very valid point about Firebuild’s single threaded supervisor being a bottleneck on high core systems and comparison to ccache also came up in comments. Since ccache does not have a central supervisor it could scale better with more cores, but let’s see if ccache could go below 10 seconds with the build times…

firebuild -o ‘processes.skip_cache = []’ and ccache scaling to 24 cores

Well, no. The best time time for ccache is 18.81s, with -j24. Both firebuild and ccache keep gaining from extra cores up to 8 cores, but beyond that the wall clock time improvements diminish. The more interesting difference is that firebuild‘s user and sys time is basically constant from -j1 to -j24 similarly to ccache‘s user time, but ccache‘s sys time increases linearly or exponentially with the number of used cores. I suspect this is due to the many parallel ccache processes performing file operations to check if cache entries could be reused, while in firebuild’s case the supervisor performs most of that work – not requiring in-kernel synchronization across multiple cores.

It is true, that the single threaded firebuild supervisor is a bottleneck, but the supervisor also implements a central filesystem cache, thus checking if a command’s cache entry can be reused can be implemented with much fewer system calls and much less user space hashing making the architecture more efficient overall than ccache‘s.

The beauty of Firebuild is not being faster than ccache, but being faster than ccache with basically no hard-coded information about how C compilers work. It can accelerate any other compiler or program that generates deterministic output from its input, just by observing what they did in their prior runs. It is like having ccache for every compiler including in-house developed ones, and also for random slow scripts.

Scott Moser: Fixing the qemu serial console terminal size.

Mar, 21/03/2023 - 1:00pd
Fixing the qemu serial console terminal size.

Qemu can allow you to attach to a serial console of your guest. This is done with either -nographic or -serial mon:stdio. Its great for getting text output, but after you’ve logged in, you may find issues editing long command lines or using an editor such as vi.

To fix this, you simply need to tell linux what the terminal size you’re using is.

First, figure out what your terminal size is. You have to do this before connecting to qemu, or some other way such as another tmux window or terminal.

  • With bash via $LINES and $COLUMNS:

    $ echo rows $LINES columns $COLUMNS rows 75 columns 120
  • With stty, look for ‘rows’ and ‘columns’ below:

    $ stty -a speed 38400 baud; rows 75; columns 120; line = 0; intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0; -parenb -parodd -cmspar cs8 -hupcl -cstopb cread -clocal -crtscts -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc -ixany -imaxbel iutf8 opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke -flusho -extproc $ echo $TERM xterm

Then tell linux inside qemu about it via stty:

$ stty rows 75 columns 120 # now show what it is $ stty -a |head -n 1 speed 115200 baud; rows 75; columns 120; line = 0;

Last, you probably want to update your TERM variable :

$ export TERM=xterm

Thats it.

Andrea Corbellini: On ignoring mistakes, resilience, and the hidden dangers therein

Sht, 18/03/2023 - 9:20pd

As a scuba diver who often explores new places, I can say that I have found myself in some dangerous situations, but I always made it back to the surface without facing any negative consequences. Does this mean that I never made any mistakes? Absolutely not: mistakes were made, and lessons were learned.

We can all agree that learning from mistakes is good, but sometimes, when mistakes happen and consequences don’t manifest themselves immediately, we run the risk of not noticing them, not learning from them, repeating them, and over time developing a false sense of confidence, which can drive us to believe that our repeated mistakes are actually good practices.

Why do we ignore mistakes? Because sometimes outcomes are positive even if we make mistakes. “I made it out of water even this time, this means that my dive was executed perfectly.” This is a common way of reasoning, but in reality, things are much more complex than that. There is a difference between correct execution and successful outcome, and the two should not be confused. In fact, everyone should know from experience that goals can be achieved even if the execution was sloppy and full of mistakes. Catastrophic consequences may happen if we fail to see that.

An example of the consequences of ignoring mistakes is given by the two space shuttle disasters: the Challenger disaster of 1986, and the Columbia disaster of 2003. Both these instances were caused by NASA leadership ignoring the concerns from the engineering teams. Problems that occurred in previous shuttle launches should have been a wake-up call for NASA leadership. Instead, all the previous successful launches and re-entries despite the problems were seen as accomplishments, and nourished the leadership’s overconfidence. “We have made it this time too, this means that all those concerns that engineers raised were excessive.”

The tendency of diverting from proper procedures, dismissing valid concerns, and ignoring problems, has a name: it’s called normalization of deviance. The driving force of normalization of deviance is overconfidence and the false belief that positive outcomes are inherently caused by correct executions.

Overconfidence and normalization of deviance can spread like a virus in an organization. It is important to be vigilant for signs of overconfidence in individuals, before it infects other people. I once had to deal with a manager who was a self-declared micromanager (and proud to be) but lacked technical foundations and knowledge of the product. He would consistently and quickly dismiss anything that he did not understand, and focus on short-term goals of questionable usefulness. Whenever his team would accomplish a goal, he would send a pumped-up announcement, often containing inaccuracies, and carefully skipping over the shortcomings of the solutions implemented. Given the apparent success of this management style, other managers started to follow his example. Soon after (in less than a year), the entire organization became a toxic environment where raising even the minimal concern was seen as an attack on the “great new vision”.

I see many parallels between this manager story and what is happening with ‘Twitter 2.0’ right now (although, I must say, in my case engineers did not get fired on the spot for speaking the truth). And with that manager, just like with ‘Twitter 2.0’, whenever problems occurred, those problems would either be ignored or blamed on the preexisting components built before the manager joined, never on the new, careless developments.

The truth however was that problems that occurred had been preannounced weeks, or months before, but concerns around them had been promptly dismissed due to being too challenging to address, and because “everything works right now, so that’s not a concern”.

The idea that everything must be correct because everything works, goals are achieved, and outcomes are successful, is a dangerous idea that can potentially have catastrophic consequences. It’s important to be critical and analytical, regardless of the outcome. This does not mean that success shouldn’t be celebrated, but that mistakes should be captured so that lessons can be learned from them, even if the final outcome was successful. Not learning from mistakes does not allow us to advance, and on the contrary can only lead us to repeat them. And if we keep repeating the same mistakes, sooner or later, those will have some negative consequences.

A common practice in the aviation industry is to write reports on incidents, close calls, and near misses, whenever they occur, even if the flight was concluded successfully and no injuries or damages occurred. These reports are collected in databases like the Aviation Safety Reporting System (which can be freely consulted online), so that flight safety experts and regulators can identify common failure scenarios and eventually introduce mechanisms to improve safety in the aviation industry. A key element of these reports is that they are not meant to put the blame on certain people, but rather focus on what chain of events led to a certain mistake. “Human mistake” is generally not a valid root cause: if a human was able to make a mistake, it means that a mechanism is missing that can either prevent the mistake or detect it before it causes any negative consequences.

Some companies in other industries have similar processes for writing reports or retrospectives when a mistake happens (regardless of the outcome), with the goal of finding proper root causes and preventing future mistakes. Amazon with its Correction of Error practice is a famous example.

I think introducing these practices in an organization can help to establish a healthy culture where finding mistakes and raising concerns is encouraged, rather than being oppressed. However these practices, by themselves, may not be enough to ensure that such a culture can be maintained over time, because people can always disagree on what is considered a ‘mistake’. Empathy is probably the key to a truly healthy culture that allows people to learn and advance.

There are also cases where we are aware of problems, and we see them as such, but we deliberately choose not to do anything about them. This is where resilience comes into play.

Resilience is generally a good quality to have. Resilience can give us the strength to go through long-term hardships, and can have positive effects on our tenancy and determination. But even resilience, when taken to the extreme, can be dangerous. Resilience can lead us to ignore problems, and not react to them. Resilience can make us tolerate a negative situation, without finding a proper strategy to cope with it.

Poor planning forces you to consistently work extra hours? Resist and keep going, until you burn out. The relationship with your partner doesn’t satisfy you? Resist and think that things will get better, while the relationship slowly deteriorates. Feel pain in your knee every time you run for more than 30 minutes? Resist and don’t go to see a doctor, the pain will go away, sooner or later… until you cannot run anymore.

When we let resilience become an excuse to avoid solving problems, we can end up in situations from which it’s difficult to recover.

It’s important to make a distinction between what is under our control and what is not. We can fix problems that are under our control, but in situations where we cannot directly change the course of things, finding an alternative strategy is the only way. Resisting and hoping that things will get better often does not give the expected outcome–on the contrary, it can be detrimental.

In the end, I think that the ‘practice’ of ignoring mistakes (because of the overconfidence built from successful outcomes) or ignoring problems (because of resilience taken to the extreme) are hidden time bombs, silently ticking, waiting for the right conditions before exploding. We need to be aware that just because things seem to work today, it doesn’t mean that we’re making the right decisions, and this can have consequences in the future. Being critical, analytical, empathetic, and honest is important to avoid these behaviors and the dangers that come with them.

Aaron Rainbolt: An In-depth Review of the Kubuntu Focus XE Gen 1

Pre, 17/03/2023 - 4:06pd

About four months ago, a friend of mine got me in touch with his employer at the time. I was in a bit of a bind, using rather old and slow hardware (a 3rd gen mobile i5) for my development work, and was hoping to get a newer, faster system to help make things less painful. My friend and his boss managed to get me hooked up with a bug bounty - in exchange for fixing one of several high-priority bugs in the KDE Desktop Environment, I would receive a new Kubuntu Focus system, either a KFocus XE laptop (which is what I ultimately chose), or a KFocus NX desktop. I jumped at the chance, and managed to play a significant role in resolving two of the bugs. I also got a much better computer.

A Kubuntu Focus XE laptop displaying my personal system’s screen.

That was about four months ago. And if you’ve ever gotten a glimpse of my workflow, you know just how much damage… er… testing I can do in four months. I do not treat my systems all that kindly. After having gone through the trials of over seven terabytes of SSD writes, thousands of software compiles, probably over a hundred VM installs, and three distrohops, I finally feel that I can give this system a proper review. For those of you who have been looking into the Kubuntu Focus systems, this should help you get an idea of what you can expect from them, and for those who are on the hunt for a good Linux laptop, this should hopefully help you choose a good one.

First impressions

The Kubuntu Focus XE is essentially a nicely modded Clevo laptop with Kubuntu 22.04 LTS and a suite of utilities and extra features. For those of you who don’t already know this, Kubuntu is an official flavor of the Ubuntu Linux operating system, using the KDE desktop rather than the GNOME desktop. I personally have always been a huge fan of KDE (in fact it was the desktop that came with the first “real” distro I ever used), which made the idea of contributing to KDE to get a laptop even more appealing. The Kubuntu installation that comes with the Kubuntu Focus has a ton of value-adding customizations, including hardware optimization, curated apps and libraries, a very efficient setup wizard, built-in file versioning and recovery, a special Quick Help Widget, and several custom-made utilities for changing special hardware and software settings that are difficult to change otherwise.

The actual laptop is quite thin and light compared to many older systems, and features quite a few I/O ports of various types, including a Gigabit Ethernet port (which is somewhat rare among modern laptops). The keyboard features backlighting, which is always a nice touch if you do a lot of typing, and the cooling system uses dual fans, which looks cool and does a good job of keeping the system cool while under heavy load. Other systems I’ve used would get quite toasty if you really hammered them, while this one gets warm in just one spot right where the CPU itself is located.

For specs, the system I received came with an i5-1135G7 processor, 32 GB of 3200 MHz RAM, and a 1 TB Samsung 970 EVO PLUS solid state drive. 32 GB of RAM is a godsend when you’re doing really intensive work (like running four or more VMs at the same time, something I’ve been known to do in the past). The SSD’s size and speed has also been quite welcome, especially since a lot of what I do involves copious amounts of disk writes.

All in all, things looked pretty impressive initially. The final verdict, however, would depend on how it performed in real-world use.

Kompilation

Having done some KDE development work, it seemed only fitting to use the new machine for KDE development. But not normal KDE development. I wasn’t going to just hack on some source code and do a few builds to test my changes.

One of the projects I maintain is KForge, which is a KDE development virtual appliance that has all of the needed build and runtime dependencies for almost all KDE software preinstalled. This is quite useful as it has been historically difficult to set up a fully functional KDE development environment due to missing build dependencies. You usually don’t know you’re missing a dependency until a build fails, and sometimes the logs left by the build failures are significantly less than helpful.

However, getting KForge to work just right was tricky for the same reason it’s tricky to get a decent KDE build environment going. So, if I was going to make sure I was pulling in all of the dependencies for all of the KDE software, I was going to have to compile almost everything in KDE.

By “almost everything”, I mean 300+ packages.

I ended up deciding to exclude some KDE applications from the build (namely Krita and Kdenlive, both of which I felt would just add too much bulk to the build and wouldn’t add all that much value), and there was one component I ended up excluding since it was defunct and wasn’t maintained anymore. But the remaining 300+ pieces of software needed built. The idea was that, if I got all the dependencies right, and if all of the KDE development branches were in buildable states, I would be able to successfully build everything, and that would tell me that I had succeeded at finding all of the dependencies.

Needless to say, compiling over 300 applications is virtually always going to be a large task even if the apps are somewhat small. Not only were some of these apps not small, many of them were juggernauts (especially KWin, the window manager used by KDE).

While the system’s fans do ramp up significantly and can be quite loud during compilation, the system itself actually remains responsive during such a job. Responsive enough for me to just keep using it while the compilation happens in the background. This was somewhat amazing, since my old system was known for randomly freezing when it got hit with a lot of disk writes, and so to be able to hammer the system like this in the background and still have it be usable during that time was massively helpful. Not to mention the dramatically reduced compile times compared to my older system.

Multitasking

Part of me isn’t sure why I do this, but I have a tendency to try and work on two or three projects at once, sometimes more. As a result, I generally have a slew of apps open all over the place that I switch between rapidly. As I write this part of the review, I’m review-writing, chatting on both IRC and Matrix, and trying to reproduce a bug in Lubuntu, all at the same time. As a result, a well-streamlined DE is very helpful.

KDE is, for me, very good at making work easy. But the Kubuntu Focus takes things a step further and makes it easy to access virtual desktops, something that Kubuntu usually doesn’t have enabled by default. Virtual desktops let me break up my work into different “screens”. The KFocus has four virtual desktops enabled out of the box, essentially giving me four screens at the same time, all on one screen. Not only that, but I can switch between virtual desktops with Ctrl+F1, Ctrl+F2, etc, allowing me to switch desktops without even having to click on the virtual desktop I want.

Additionally, the lots of RAM my system came with made multitasking comfortable from a performance perspective - lag was very rare and the times it happened, it was very small, and only happened during things like a massive compile job. (And even that rarely caused any lag.)

On the hardware side of things, multitasking is also helped by a good keyboard and touchpad - hard-to-press keys and a jumpy or insensitive touchpad can seriously hamper productivity. The keyboard on the KFocus XE is really nice and allows me to type at 100 WPM without problems (and that’s with a cat on my lap and distractions nearby - I could very likely go faster in an ideal situation). The mousepad was initially tricky since it’s so big and I would keep accidentally brushing it with my hand - this ended up becoming a non-issue with time as I got used to holding my hands away from it (which is probably better for avoiding carpal tunnel too). Oh, and the mousepad is also quite precise, responsive, and works well even when I’m doing things at high speeds. The only other problem I’ve noticed is that the touchpad will occasionally stop responding for no apparent reason. This happens infrequently, but when it happens, I’m able to simply switch to a TTY by pressing Ctrl+Alt+F3, then return to my desktop with Ctrl+Alt+F1 - this makes the touchpad work again.

Graphics

I am not a gamer, and the KFocus XE is not designed for heavy gaming. I have no clue how well this thing performs if you’re intending to run intensive 3D games on it. That being said, graphics are important even in the general productivity world to some degree.

The Kubuntu Focus XE boasts a fairly good Intel iRIS XE iGPU, and while Intel’s integrated graphics are not known for being the pixel-crunching behemoths that NVIDIA and AMD churn out, they’re generally great for light and moderate uses. That was exactly my use case.

Everything I threw at the machine graphics-wise worked swimmingly. Full HD video playback is free of stutters and screen tearing. Windows have no stuttering when dragging them (even with the window transparency and wobbly windows that the OS has enabled by default, which look really cool and smooth). Desktop animations are nice and smooth everywhere.

For 3D performance, I tried using thirdroom.io (which is basically an open-source Metaverse-like thing) in Chrome. My fans got a bit louder, but other than that it seemed to perform OK, though it was somewhat laggy. All of the graphics rendered relatively smoothly. The graphics performance for the Intel iGPU is listed on the KFocus website as being similar to the NVIDIA GeForce MX250 - benchmarks show the Intel iGPU as being as good or better than the MX250 in many tests and only slightly lacking in others.

The other important part of the graphics experience is the screen. Here I have one gripe… and before I describe it, I should say, I’m pretty sure this is my fault - my friend has not had this problem, and I’ve not seen any other reports of it. The one problem is screen burn-in. If stuff around the edges of the screen stays in the same spot too long, and then I turn the screen off and leave it that way for a while, I can see shadows of the old screen contents later, sometimes quite well. The reason I think this is my fault is because it didn’t happen at first, and then I left the laptop upside-down trying to improve its cooling airflow during a large compile job. I left it in this state overnight… and when I opened the laptop in the morning, I could see burn-in. Facepalm. I confirmed with my friend that KFocus XE is not intended to be left upside-down like that, and that I may have damaged the screen in so doing. Which would explain why the burn-in happens most near the edges of the screen and much less in the middle. Thankfully, the burn-in mostly goes away on its own if I leave the screen on long enough, and I can use the LCD Scrub screensaver from XScreenSaver to help it out (which may not even be necessary). Other than that, the screen seems to have survived my accidental mistreatment without problems.

Apart from my most likely self-inflicted burn-in issue, the screen is amazing. Colors are bright and crisp, and details are easily visible even when they’re small thanks to the Full HD resolution. The closest thing I’ve had to a monitor this good was a ViewSonic Full HD monitor, and both of the ones I have are dead. Every other Full HD monitor I have either has horrible color or something wrong with the backlight - the KFocus XE screen has neither problem. Coming from a 1366x768 screen, this high-res screen has been a gigantic relief and makes multitasking a lot easier. I can’t think of any way the screen could be better - if it was any higher resolution, I’d need display scaling due to its size, and that would essentially negate the best effect of having a higher res monitor (more screen real estate).

As a whole, the graphics and screen of the KFocus XE have surpassed my expectations and are perfect for everything I do.

Mobility

A good laptop generally has to be relatively lightweight, have good battery life, and have good WiFi range in order to be useful as a mobile system. In my experience, the Kubuntu Focus XE has all three.

For weight, the XE is, as discussed earlier, quite thin and light. Unlike my older system, which was a leg-squisher and could end up causing actual pain if you left it on your lap too long, the XE can remain on my lap for extended periods of time without problems. It’s not as light as my tablet (an HP Chromebook x2), but it’s fairly close, which is surprising considering that my tablet is a fanless machine with a rather pathetic processor. Whereas the KFocus XE has dual fans and a powerful processor.

Battery life is also quite good. I generally prefer battery lifetime over “how long can I use the battery before it drains dry”, so I don’t have my XE disconnected from its charger for super long, and I use the FlexiCharger feature, which I have set to make the laptop not charge at all until it get to below 40% and stop charging once it hits about 80%. Most of the time, I only use 25% of the battery life at a time, if that. That lasts me for what I would guess is an hour of intensive use, and much longer when in a power-saving mode. According to the Kubuntu Focus team, the battery should last from four hours under heavy use, to as much as ten hours under light use. That fits with my experience, and that also means that the battery in this should last longer than my ChromeOS tablet (which only gives about six to seven hours).

As for WiFi range, I currently live in a relatively small house where each room is about two rooms away from the room furthest away from it. One would think a house of this size would make it tricky to thoroughly review the WiFi range, but the smallish house is offset by my absolutely pathetic router (alright so it’s a cellular mobile hotspot about the size of half a deck of cards). I have the hotspot on one side of my house, yet I still get connectivity and decent speeds on the other side of the house most of the time. Considering the very poor quality of the WiFi signal provided by the hotspot, this is really good, and it works perfectly for what I do.

Networking and Connectivity

The WiFi chip in the KFocus XE is the Intel AX201, which is a WiFi 6 device that is theoretically capable of 2.4 Gbps speeds. My hotspot doesn’t provide anything even close to that, but while I may not get the full speed of the chip, I definitely get the full speed of my hotspot, as I see no difference in speeds whether I use a WiFi connection or a direct USB connection to the hotspot.

The Ethernet included in the KFocus XE is a fairly standard Gigabit Ethernet port with a Realtek controller. It works flawlessly for me, and gives me full gigabit speeds when transferring data from one computer to another via an Ethernet cable. I just set one of the computers to share an IPv4 connection, then put one end of the cable in the laptop, the other end of the cable in the other computer, and then use netcat to beam whatever I need transferred. I can even share an Internet connection to another computer doing this. Ethernet is increasingly rare in modern laptops, yet still very handy in many instances, so this is a very nice feature to have.

As for the other ports on the system, there’s a pair of USB3 ports, a pair of USB-C ports (one of which supports Thunderbolt 4), a headphone jack, an SD card slot, and an HDMI port. This is more than sufficient for my uses, and for those who need more ports, the Thunderbolt 4 port should let you attach a docking station to extend the system.

Another connectivity feature I use is Bluetooth. In my experience, Bluetooth has been quite finicky on other laptops I’ve used, so I didn’t have very high expectations here. However, surprisingly, things mostly worked as I would have expected them to. My Bluetooth earbuds connected without problems, and while they are a little fiddly when getting them reconnected, they reconnect pretty easily and work well once connected. (I disconnect and reconnect them manually in the Bluetooth menu, then they work right.)

Aesthetics

The KFocus XE not only performs beautifully, it also looks beautiful IMO. There are plenty of well-made design choices both within the OS and in the hardware itself. If you’ve ever used Kubuntu before, you’re already familiar with KDE’s aesthetics. The customized Kubuntu installation that comes with the KFocus XE keeps the same basic look-and-feel, but also throws in a few more nice-looking (and even functional) design elements.

The most noticeable of these additions is the Quick Help Widget on the desktop (visible in the screenshot near the start of the review). The widget looks sharp and complements the rest of the design rather well. It has several pages inside loaded with numerous keyboard shortcuts, terminal commands, and even a Vim cheat sheet.

There’s also a couple of window effects enabled by default that I mentioned earlier - transparency and wobbly windows. In essence, when you grab a window in order to move it across the screen, the window becomes transparent, allowing you to see underneath it while moving it. The window will also “flow” as you move it, making the movement look smoother. The wobbly windows effect also causes windows to visibly “snap” into place if you use window snapping, which is kind of the visual equivalent of tactile feedback - you can “feel” the window snap into place properly.

On the hardware side of things, there are several nice touches that add to the system’s aesthetic value, most noticeably the hexagonal hinges on the screen, the Kubuntu logo on the Meta key (where you usually see a Windows logo on most laptops), and the mitered angles on the edges of the recessed keyboard. The backlighting on the keyboard is also a nice touch (both functionally and aesthetically).

Setup

I saved this part for near the end since, in all likelihood, you’re only going to go through the setup procedure once per reinstall, and you’re going to reinstall very rarely, if ever. Most of what you do is going to be done once everything has been set up. But the setup procedure on the Kubuntu Focus is a major feature of the OS it ships with, so I felt it was worth reviewing.

This part was somewhat shocking for me. Most of the time, once I install a new OS on my system, it takes me a while to get fully set up. A lot of the apps I use aren’t installed by default, getting logged into everything takes time, etc. The worst part is that I have a very bad tendency to forget things, so I won’t end up getting some part of my workflow set up at all until I hit something where I need it to be set up. The Kubuntu Focus has an entire initial setup wizard that makes a massive amount of the initial setup work way easier. And there are a ton of preinstalled apps, many of which are ones I actually use. The wizard has several neat features built into it, including helping me remember to log into my email. (That part was particularly helpful since I’m pretty sure I’ve had at least one time when I only remembered to log into Thunderbird because I went to check my email.)

After a few minutes of downloads and logins, most of everything I needed was set up. I did end up having to install a few apps myself, as well as my Ubuntu development utilities. And I think I found one small bug in the wizard when I rejected VirtualBox’s Personal Use and Evaluation License for their expansion pack because I didn’t want to use Oracle’s proprietary VBox addons and comply with the restrictions that came with them. Doing so still let VirtualBox install (which was correct), but also left it marked as autoremovable (not correct). No big deal, I ended up installing a newer version of VirtualBox from the main VBox website anyway. It’s a setup wizard, it’s not going to magically read your mind and set up everything perfectly the first time. But it got pretty close, and that was close enough to save me a ton of setup time.

One unique part of the setup process is the “Curated Apps” feature. Somehow (and I don’t fully understand how), the Kubuntu Focus people have a website where you can click on the icon of an app you want to install (from a list of apps they recommend), and it will automatically do the installation for you. You have to type your password (thankfully!), but aside from that, it’s a one-touch installation procedure. Through a website. (I’m guessing there’s a client-side thingy built into the OS that makes this work.) It can even launch an app if its already installed or if you’ve just installed it. I installed several apps through the Curated Apps page, including VSCode, which is usually a bit tricky to set up. The installation process for each app was simpler even than using the Google Play store.

The whole setup procedure was drastically different from the usual “Bailing out, you’re on your own, good luck” post-install experience most other operating systems leave you with. It was an extremely good intro to a fantastic system, and it helped me a lot.

Distrohopping

The built-in Kubuntu operating system works quite well and should be all most people need. But some of us either need to use a different distro, or want to explore other distros on our physical hardware. So far I haven’t tried distros other than official Ubuntu flavors, since I really like the Ubuntu ecosystem. But within the Ubuntu ecosystem at least, the KFocus XE works like a champ.

I tried two other Ubuntu flavors (Lubuntu 23.04 Alpha and Ubuntu Unity 22.10), and both of them worked very well out of the box. I assume that other distros like Fedora, Arch, or openSUSE would very likely work well, since the hardware in the KFocus is quite Linux-friendly.

There are a couple of downsides of using a different Ubuntu flavor on the KFocus XE. For one, you lose a bunch of the tools that the Kubuntu Focus Suite has built in, like the fan profile settings. You may also not end up being able to set a power profile like KDE allows you to do. You can install the Kubuntu Focus Tools on other distros, however this will probably pull in a lot of KDE dependencies since the Kubuntu Focus Tools, are, unsurprisingly, Kubuntu-focused. Personally, I just lived without them when I was trying out different distros.

The other downside is the loss of system component curation. One hidden feature that you usually won’t see on the KFocus is the fact that the Kubuntu Focus team tests updates to several important components, such as the curated apps and the kernel, before releasing those updates to end-user systems. So if, for instance, a kernel update breaks some hardware, you won’t get that update, and your hardware will continue to work right. When you’re using other distros, the Kubuntu Focus team no longer can control those updates, and so if an update breaks something, you’ll probably end up seeing that breakage.

Aside from those two problems, the user experience when using different Ubuntu flavors was flawless for me. All hardware worked out of the box without issues (except for Bluetooth was much finickier on Ubuntu Unity than it was on Kubuntu, which may have been a consequence of lacking the curated system components). The system performed perfectly well, and I was able to use it just like I’d use any other computer I installed Linux on.

Pretty much anything with a Linux 5.17 or newer kernel should work, if I’m understanding correctly. (The 5.15 kernel is probably not ideal.) Ubuntu 22.04.2 or higher should work out of the box. There aren’t any required third-party drivers, so as long as your distro of choice ships with a reasonable set of in-kernel drivers, everything should just work out of the box, even if you’re using a non-Ubuntu distro.

Final thoughts

I mean, this is the best laptop I’ve ever used in my life. It’s fast, it’s powerful, everything works smoothly, it looks nice, and it makes it easy to get work done. That’s what a good laptop should do. If you’re into Linux and want a laptop that just works, I’d highly recommend the Kubuntu Focus XE. It’s amazing.

Thanks for taking a look, and I hope this has helped you in some way. Thanks to Michael and the Kubuntu Focus team for making these awesome systems, and for giving me the opportunity to work for one!

Andrea Corbellini: Authenticated encryption: why you need it and how it works

Enj, 09/03/2023 - 7:35md

In this article I want to explore a common problem of modern cryptographic ciphers: malleability. I will explain that problem with some hands-on examples, and then look in detail at how that problem is solved through the use of authenticated encryption. I will describe in particular two algorithms that provide authenticated encryption: ChaCha20-Poly1305 and AES-GCM, and briefly mention some of their variants.

The problem

If we want to encrypt some data, a very common approach is to use a symmetric cipher. When we use a symmetric cipher, we hold a secret key, which is generally a sequence of bits chosen at random of some fixed length (nowadays ranging from 128 to 256 bits). The symmetric cipher takes two inputs: the secret key, and the message that we want to encrypt, and produces a single output: a ciphertext. Decryption is the inverse process: it takes the secret key and the ciphertext as the input and yields back the original message as an output. With symmetric ciphers, we use the same secret key both to encrypt and decrypt messages, and this is why they are called symmetric (this is in contrast with public key cryptography, or asymmetric cryptography, where encryption and decryption are performed using two different keys: a public key and a private key).

Generally speaking, symmetric ciphers can be divided into two big families:

  • stream ciphers, which can encrypt data bit-by-bit;
  • block ciphers, which can encrypt data block-by-block, where a block has a fixed size (usually 128 bits).

As we will discover soon, both these two families exhibit the same fundamental problem, although they slightly differ in the way this problem manifests itself. To understand this problem, let’s take a close look at how these two families of algorithms work and how we can manipulate the ciphertexts they produce.

Stream ciphers

A good way to think of a stream cipher is as a deterministic random number generator that yields a sequence of random bits. The secret key can be thought of as the seed for the random number generator. Every time we initialize the random number generator with the same secret key, we will get exactly the same sequence of random bits out of it.

The bits coming out of the random number generator can then be XOR-ed together with the data that we want to encrypt: ciphertext = random sequence XOR message, like in the following example:

random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F ⊕ message: I would really like an ice cream right now = ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1

The XOR operator acts as a toggle that can either flip bits or keep them unchanged. Let me explain with an example:

  • a XOR 0 = a
  • a XOR 1 = NOT a

If we XOR “something” with a 0 bit, we get “something” out; if we XOR “something” with a 1 bit, we get the opposite of “something”. And if we use the same toggle twice, we return to the initial state:

  • a XOR b XOR b = a

This works for any a and any b and it’s due to the fact that b XOR b is always equal to 0. In more technical terms, each input is its own self-inverse under the XOR operator.

The self-inverse property gives us a way to decrypt the message that we encrypted above: all we have to do is to replay the random sequence and XOR it together with the ciphertext:

random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F ⊕ ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1 = message: I would really like an ice cream right now

This works because ciphertext = random sequence XOR message, therefore random sequence XOR ciphertext = random sequence XOR random sequence XOR message. The two random sequence are the same, so they cancel each other (self-inverse), leaving only message:

random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F ⊕ random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F ⊕ message: I would really like an ice cream right now = message: I would really like an ice cream right now

Only the owner of the secret key will be able to generate the random sequence, therefore only the owner of the secret key should, in theory, be able to recover the message using this method.

Playing with stream ciphers

The self-inverse property not only allows us to recover the message from the random sequence and the ciphertext, but it also allows us to recover the random sequence if can correctly guess the message:

message: I would really like an ice cream right now ⊕ ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1 = random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F

This “feature” opens the door to at least two serious problems. If we are able to correctly guess the message or a portion of it, then we can:

  1. decrypt other ciphertexts produced by the same secret key (or at least portions of them, depending on what portions of the random sequence we were able to recover);
  2. modify ciphertexts.

And we can do all of this without any knowledge of the secret key.

The first problem implies that key reuse is forbidden with stream ciphers. Every time we want to encrypt something with a stream cipher, we need a new key. This problem is easily solved by the use of a nonce (also known as initialization vector, IV, or starting variable, SV): a random value that is generated before every encryption, and that is combined in some way with the secret key to produce a new value to initialize the random number generator. If the nonce is unique per encryption, then we can be sufficiently confident that the random sequence generated will also be unique. The nonce value does not necessarily need to be kept secret, and needs to be known at decryption time. Nonces are usually generated at random at encryption time and stored alongside the ciphertext.

The second problem is a bit more subtle: if we have a ciphertext and we can correctly guess the original message that produced it, we can modify it using the XOR operator to “cancel” the original message and “insert” a new message, like in this example:

ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1 ⊕ message: I would really like an ice cream right now ⊕ altered message: I would really like to go to bed right now = tampered ciphertext: zB686Y0H46144HwT9RQQG7vTZt3Yc030n390ZCqXV1

This message, when correctly decrypted with the secret key, will return the tampered ciphertext without detection!

Note that I do not need to know the full message to carry out this technique, in fact, the following example (where unknown parts have been replaced by hyphens) produces the same result as the above one:

ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1 ⊕ message: --------------------an ice cream---------- ⊕ altered message: --------------------to go to bed---------- = tampered ciphertext: zB686Y0H46144HwT9RQQG7vTZt3Yc030n390ZCqXV1

This problem is known as malleability, and it’s a serious issue in the real world because most of the messages that we exchange are in practice relatively easy to guess.

Suppose for example that I have control over a WiFi network, and I can inspect and alter the internet traffic that passes through it. Suppose that I know that a person connected to my WiFi network is visiting an e-commerce website and that they’re interested in a particular item. The traffic that your browser exchanges with the e-commerce website may be encrypted, and therefore I won’t be able to decrypt its contents, but I might be able to guess certain parts of it, like the HTTP headers sent by the website, or some parts of the HTML that are common to all pages on that website, or even the name and the price of the item you want to buy. If I can guess that information (which is public information, not a secret, and it’s generally easy to guess), then I might be able to alter some parts of the web page, showing you false information, and altering the price that you see in an attempt to trick you into buying that item.

An example of malleability using ChaCha20 with OpenSSL

Here’s a practical example of how we can take the output of a stream cipher, and alter it as we wish without knowledge of the secret key. I’m going to use the OpenSSL command line interface to encrypt a message with a stream cipher: ChaCha20. This is a modern, fast, stream cipher with a good reputation:

openssl enc -chacha20 \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 0123456789abcdef0123456789abcdef \ -in <(echo 'I would really like an ice cream right now') \ -out ciphertext.bin

The -K option specifies the key in hexadecimal format (256 bits, or 32 bytes, or 64 hex characters), the -iv is the nonce, also known as initialization vector (128 bits, or 16 bytes, or 32 hex characters).

This trivial Python script can tamper with the ciphertext:

with open('ciphertext.bin', 'rb') as file: ciphertext = file.read() guessed_message = b'--------------------an ice cream----------\n' replacement_message = b'--------------------to go to bed----------\n' tampered_ciphertext = bytes(x ^ y ^ z for (x, y, z) in zip(ciphertext, guessed_message, replacement_message)) with open('tampered-ciphertext.bin', 'wb') as file: file.write(tampered_ciphertext)

This script is using partial knowledge of the message. It knows (thanks to an educated guess) that the original message contained the words “an ice cream” at a specific offset, and uses that knowledge to replace those words with new ones (“to go to bed”) which add up to the same length. Note that this technique cannot be used to remove or add parts from the message, only to modify them without changing their length.

Now if we run this script and we decrypt the tampered-ciphertext.bin file with the same key and nonce as before, we get “to go to bed” instead of “an ice cream”, without any error indicating that tampering occurred:

openssl enc -d -chacha20 \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 0123456789abcdef0123456789abcdef \ -in tampered-ciphertext.bin Block ciphers

We have seen that stream ciphers alone have a serious problem (malleability) that allows anyone to modify arbitrary portions of ciphertexts without detection. Let’s take a look at the alternative: block ciphers. Will they have the same problem?

While a stream cipher can encrypt variable amounts of data, a block cipher can only take as input a block of data of a fixed size, and produce as output another block of data. A good block cipher produces an output that is indistinguishable from random.

The block size is generally small, usually 128 bits (16 bytes), so if we want to encrypt larger amounts of data, we have to split the data into multiple blocks, and encrypt each block individually. If the data is too short to fit in a block, the data will also need to be padded.

message: The cat is on th e table......... |______________| |______________| block #1 block #2 (padded) ciphertext: c2TNPW3r09hZ6f1P Vc32VX41XSy579Y9

This approach however has a problem: if we encrypt multiple blocks with the same secret key, then portions of messages that are the repeated will produce the same output. This gives the ability to analyze a ciphertext and find patterns in it without knowledge of the secret key. This problem is famously evident when encrypting pictures:

Example of applying a block cipher to an uncompressed image. The original colors are lost, but the overall layout of the image is still understandable. That's because multiple blocks of the image (containing the RGB values of each pixel), for example from the white background, are repeated multiple times, yielding the same exact encrypted blocks. The inspiration for making this image came from Wikipedia. How this image was generated

Before jumping into how I encrypted the image, let me spend a few words on how I did NOT encrypt the image: I did not use a modern image format. Modern image formats are very sophisticated, they’re not a simple sequence of RGB values. Instead, they have some control structures mixed in the image, they implement compression to reduce the image size, etc. This complexity means that if I simply take an image in any format and encrypt it, the result won’t be visualizable by an image viewer: the image viewer would just throw an error because it would find invalid data structures.

Note that this does not imply that encrypting modern image formats is more secure: people can still analyze patterns in them, but it simply means that a modern image format, once encrypted, would not produce the sensational visualization that I showed above.

In order to produce this visualization I had to find an uncompressed image format without too much metadata in it. Thankfully the Wikipedia article on image file formats provided a list, which included the Netpbm family of formats (something I never heard of before). Among the formats in this family, I chose PPM, because it’s the one that supports colors.

The PPM file format is very simple: it has 3 lines of metadata, followed by the RGB values for each pixel. No compression. Definitely the right format for this kind of experiment!

So here’s what I did: first of all I downloaded an image (the Ubuntu “Circle of Friends” logo, obtained from Wikipedia) and converted it to PPM with ImageMagick:

convert UbuntuCoF.png img.ppm

I separated the header from the RGB values:

head -n3 img.ppm > ppm-header tail -n+4 img.ppm > ppm-image

The reason why I separated the header from RGB values is that I won’t encrypt the header. If I did, then the image won’t be visualizable by an image viewer, just like if I used a modern image format. In a real-world scenario, a person would be able to easily guess the header if it was encrypted.

I encrypted the RGB values with AES-256, a modern, strong block cipher with a good reputation:

openssl enc -aes-256-ecb \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -in ppm-image \ -out ppm-image-encrypted

Then I joined the header and the encrypted RGB values in a PPM file:

cat ppm-header ppm-image-encrypted > img-encrypted.ppm

This results in a randomized image that can be viewed without problems on an image viewer. It’s interesting to see that, if you change the encryption key, you will get a different randomized image!

The problem we have just seen is known as lack of diffusion. This is kinda analogous to the first problem we identified with stream ciphers: at the root of both problems there is key reuse. We solved this problem for stream ciphers by combining a key with a random nonce. We could use the same strategy here, would be an expensive approach, as initializing a block cipher with a new key is a relatively expensive operation. It’s much cheaper to initialize the block cipher once, and reuse it for every block encryption. We need a way to “link” blocks to each other, so that if two linked blocks contain the same plaintext, their encryption will give different results.

There are various strategies do that. These strategies are known as mode of operation of block ciphers. Let’s take a look at two of them: Cipher Block Chaining (CBC) and Counter Mode (CTR).

Cipher Block Chaining (CBC)

This mode of operation, as the name suggests, ‘chains’ each block to the next one. The way it works is by using the XOR operator in the following way:

  • First of all, a random nonce is generated. The purpose of the nonce is the same as before (with stream ciphers): ensuring that using the same secret key to perform multiple encryptions yields different results each time, so that secret information or patterns are not revealed.

    The nonce does not need to be kept secret and is normally stored alongside the ciphertext so that it can be easily used during decryption. It is however important that the nonce is unique.

  • The first block of message m[0] is XOR-ed with the nonce, and then encrypted the block cipher, producing the first block of ciphertext c[0] = block_encrypt(m[0] XOR nonce)

  • The second block of message m[1] is XOR-ed with c[0], and then encrypted with the block cipher: c[1] = block_encrypt(m2 XOR c[0])

  • The last block of message m[n] is XOR-ed together with c[n-1], and then encrypted with the block cipher: c[n] = block_encrypt(m[n] XOR c[n-1])

The XOR operator is back! With stream ciphers, the XOR operator was allowing us to tamper with ciphertexts. Can we do the same thing here? Yes of course! The approach is slightly different though: instead of acting directly on the block that we want to change, we will act on the block that precedes it.

For example, if we want to change the sentence “I came home in the afternoon and the cat was on the table” so that it reads ‘dog’ instead of ‘cat’, we would need to change the block right before the one that contains the word ‘cat’. If we want to change the very first block, for example to change the word ‘came’ to ‘left’, we would need to change the nonce instead.

nonce+ciphertext: yzURZRbP6X1w3ZRL XRDnPbEkx3JUP2Fv C2ZWt19EdAXDi76H pkbk8qTgaSdzerbF 8CWYqscBqE6cSLmx ⊕ message (shifted 1 block to the left): I came home in t he afternoon and the cat was on the table....... ⊕ altered message (shifted 1 block to the left): I left home in t he afternoon and the dog was on the table....... = tampered nonce+ciphertext: yzZVQCbP6X1w3ZRL XRDnPbEkx3JUP2Fv C2ZWt67VdAXDi76H pkbk8qTgaSdzerbF 8CWYqscBqE6cSLmx

If we do the above, and then decrypt the tampered ciphertext, we will get something like this:

I left home in t���������������the dog was on the table How to get this result using AES-CBC with OpenSSL

Here’s a step-by-step guide on how to tamper with a ciphertext encrypted with AES-256 in CBC mode.

First, generate a valid ciphertext:

openssl enc -aes-256-cbc \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 0123456789abcdef0123456789abcdef \ -in <(echo 'I came home in the afternoon and the cat was on the table') \ -out ciphertext.bin

Like in the stream cipher example, -K is the key in hexadecimal format (256 bits), while -iv is the nonce (128 bits).

We can perform the tampering with this Python script:

with open('ciphertext.bin', 'rb') as file: ciphertext = file.read() guessed_message = b'---------------------cat----------------------------------------' replacement_message = b'---------------------dog----------------------------------------' tampered_ciphertext = bytes(x ^ y ^ z for (x, y, z) in zip(ciphertext, guessed_message, replacement_message)) with open('tampered-ciphertext.bin', 'wb') as file: file.write(tampered_ciphertext)

Note that OpenSSL does not store the nonce along with the ciphertext, but instead expects it to be passed as a command line argument. We need to modify it separately, so here’s another Python script just for the nonce:

nonce = bytes.fromhex('0123456789abcdef0123456789abcdef') guessed_message = b'--came----------' replacement_message = b'--left----------' tampered_nonce = bytes(x ^ y ^ z for (x, y, z) in zip(nonce, guessed_message, replacement_message)) print(tampered_nonce.hex())

If we run that script, we get: 01234a6382bacdef0123456789abcdef.

Now to decrypt the tampered ciphertext with the tampered nonce:

openssl enc -d -aes-256-cbc \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 01234a6382bacdef0123456789abcdef \ -in tampered-ciphertext.bin

It’s interesting to see that we were successful in changing the word ‘cat’ to ‘dog’ but in doing so we had to sacrifice a block, which, when decrypted, resulted in random bytes.

In a real world scenario, seeing some random bytes could raise some suspicion, and maybe generate some errors in applications, however that’s not always the case (how many times have we seen garbled text on our monitors, and we never worried that somebody was tampering with our communications). Also, when dealing with formats like HTML, one could conceal tampering attempts using comment blocks, or using JavaScript. One example of what I’m describing is the EFAIL vulnerability: discovered in 2017, it affected some popular email clients including Gmail, it targeted the use of AES in CBC mode (as well as another mode very similar to it: Cipher Feedback, CFB), and allowed the injection of malicious content in HTML emails.

We can conclude that block ciphers in CBC mode, just like stream ciphers, are also malleable.

Counter Mode (CTR)

Are other modes of operation all malleable like CBC, or will they be different? Let’s take a look at another, very common, mode of operation: Counter Mode (CTR), so that we can get a better sense of how the problem of malleability can affect the world of block ciphers.

The mechanism behind Counter Mode is very simple:

  1. A random nonce is generated. The purpose of the nonce is the usual one: make sure that repeated encryptions using the same key produce different results.

  2. A counter (usually an integer) is initialized to 1 (or any other starting value of your choice).

  3. The nonce is concatenated with the counter, and encrypted using the block cipher: r[0] = block_encrypt(nonce || counter).

    Because the block cipher can only accept as input a block of a fixed size, it follows that the length of the nonce plus the length of the counter must be equal to the block size. For example, for a 128-bit block cipher, a common choice is to have a 96-bit nonce and a 32-bit counter.

  4. The counter is incremented: counter = counter + 1 (the increment does not necessarily need to be by 1, but that’s a common choice). The nonce and the new counter are concatenated again, and encrypted using the block cipher: r[1] = block_encrypt(nonce || counter).

  5. The counter is incremented again (counter = counter + 1), and a new block is encrypted, just like before: r[2] = block_encrypt(nonce || counter).

This mechanism produces a sequence of blocks r[0], r[1], r[2], … which are indistinguishable from random. This sequence of random blocks can be XOR-ed with the message to produce the ciphertext.

It’s important that the values for the counter never repeat. If, for example, we’re using a 32-bit counter, the counter will “reset” (go back to the starting value) after 232 iterations, and will start repeating the same sequence of random blocks as it did at the beginning. This introduces the problem of lack of diffusion that we have seen before, just at a larger scale. If we’re using a 32-bit counter with a 128-bit block cipher, we cannot encrypt more than 128·232 bits = 64 GiB of data at once. This is a very important detail: exceeding these limits may allow the decryption of portions of ciphertext without knowledge of the secret key.

What Counter Mode is doing is effectively turning a block cipher into a stream cipher. As such, a block cipher in Counter Mode has the exact same malleability problems of stream ciphers that we have seen before.

An example of malleability using AES-CTR with OpenSSL

This example is going to be very similar (almost identical) to the example with ChaCha20 that I showed in the stream cipher section, just that this time I’m going to use AES-256 in CTR mode.

Let’s produce a valid ciphertext:

openssl enc -aes-256-ctr \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 0123456789abcdef0123456700000001 \ -in <(echo 'Can you give me a ride to the party?') \ -in <(echo 'Do not give me a ride to the party!') \ -out ciphertext.bin

Tamper it with Python:

with open('ciphertext.bin', 'rb') as file: ciphertext = file.read() guessed_message = b'--------------------an ice cream----------\n' replacement_message = b'--------------------to go to bed----------\n' tampered_ciphertext = bytes(x ^ y ^ z for (x, y, z) in zip(ciphertext, guessed_message, replacement_message)) with open('tampered-ciphertext.bin', 'wb') as file: file.write(tampered_ciphertext)

And now we can decrypt it:

openssl enc -d -aes-256-ctr \ -K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -iv 0123456789abcdef0123456700000001 \ -in tampered-ciphertext.bin The solution: Authenticated Encryption (AE)

We have seen that stream ciphers and block ciphers (in their mode of operation) both exhibit the same problem (in different flavors): malleability. I’ve shown some examples of how this problem can be exploited with modern ciphers like ChaCha20 and AES. These ciphers, alone, cannot guarantee the integrity or authenticity of encrypted data.

In this context, integrity is the assurance that data is not corrupted or modified in any way. Authenticity can be thought of as a stronger version of integrity, and it’s the assurance that a given ciphertext was produced only with knowledge of the secret key.

Does this mean that modern ciphers, like ChaCha20 and AES, should be considered insecure and avoided? Absolutely not! The correct answer is that those ciphers cannot be used alone. You should think of them as basic building blocks, and you need some additional building blocks in order to construct a complete and secure cryptosystem. One of these additional building block, that we are going to explore in this article, is an algorithm that provides integrity and authentication: welcome Authenticated Encryption (AE).

When using authenticated encryption, an adversary may be able to modify a ciphertext using the techniques described above, but such modification would be detected by the authentication algorithm, and decryption will fail with an error. The decrypted message at that point should be discarded, preventing the use of tampered data.

There are many different methods to implement authenticated encryption. The most common approach is to use an authentication algorithm to authenticate the ciphertext produced by a cipher. Here I will describe two very popular authentication algorithms:

  • Poly1305, which is often used in conjunction with the stream cipher ChaCha20 to form ChaCha20-Poly1305;
  • and Galois/Counter Mode (GCM), which is often used with the block cipher AES to form AES-GCM.

These authentication algorithms work by computing a hash of the ciphertext, which is then stored alongside the ciphertext. This hash is not a regular hash, but it’s a keyed hash. A regular hash is a function that takes as input some data and returns a fixed-size bit string:

$$\operatorname{hash}: data \rightarrow bits$$

A keyed hash instead takes two inputs: a secret key and some data, and produces a fixed-size bit string:

$$\operatorname{keyed-hash}: (key, data) \rightarrow bits$$

The output of the keyed hash is more often called Message Authentication Code (MAC), or authentication tag, or even just tag.

During decryption, the same authentication algorithm is run again on the ciphertext, and a new tag is produced. If the new tag matches the original tag (that was stored alongside the ciphertext), then decryption succeeds. Else, if the tags don’t match, it means that the ciphertext was modified (or the tag was modified), and decryption fails. This gives us a way to detect tampering and gives us the opportunity to reject ciphertexts that were not produced by the secret key.

The secret key passed to the keyed hash function is not necessarily the same secret key used for the encryption. In fact, both ChaCha20-Poly1305 and AES-GCM operate on a subkey derived from the key used for encryption.

Poly1305

Poly1305 is a keyed hash function proposed by Daniel J. Bernstein in 2004. It works by using polynomials evaluated modulo the prime 2130 - 5, hence the name.

The key to Poly1305 is a 256-bit string, and it’s split into two halves:

  • the first half (128 bits) is called $r$;
  • the second half (128 bits) is called $s$.

We’ll see later how this key is generated when Poly1305 is used to implement authenticated encryption. For now, let’s assume that the key is a random (unpredictable) bit string provided as an input.

The first half $r$ is also clamped by setting some of its bits to 0. This is a performance-related optimization that some Poly1305 implementations can take advantage of when doing multiplication using 64-bit registers. Clamping is performed by applying the following hexadecimal bitmask:

0ffffffc0ffffffc0ffffffc0fffffff

The message to authenticate is split into chunks of 128 bits each: $m_1$, $m_2$, $m_3$, … $m_n$. If the length of the message is not a multiple of 128 bits, then the last block may be shorter. The authentication tag is then calculated as follows:

  • Interpret $r$ and $s$ as two 128-bit little-endian integers.

  • Initialize the Poly1305 state $a_0$ to the integer 0. As we shall see later, this state will need to hold at most 131 bits.

  • For each block $m_i$:

    • Interpret the block $m_i$ as a little-endian integer.

    • Compute $\overline{m}_i$ by appending a 1-bit to the end of the block $m_i$. If $m_i$ is 128 bits long, then this is equivalent to computing $\overline{m}_i = 2^{128} + m_i$. In general, if the length of the block $m_i$ in bits is $\operatorname{len}(m_i)$, then this is equivalent to $\overline{m}_i = 2^{\operatorname{len}(m_i)} + m_i$.

      This step ensures that the resulting block $\overline{m}_i$ is always non-zero, even if the original block $\overline{m}_i$ is zero. This is important for the security of the algorithm, as explained later.

    • Compute the new state $a_i = (a_{i-1} + \overline{m}_i) \cdot r \pmod{2^{130} - 5}$. Note that, because the operation is modulo $2^{130} - 5$, the result will always fit in 130 bits.

  • Once each block has been processed, compute the final state $a_{n+1} = a_n + s$. Note that the state $a_n$ is at most 130 bits long, and $s$ is at most 128 bits long, hence the result will be at most 131 bits long.

  • Truncate the final state $a_{n+1}$ to 128 bits by removing the most significant bits.

  • Return the truncated final state $a_{n+1}$ as a little-endian byte string.

What this method is doing is computing the following polynomial in $r$ and $s$:

$$\begin{align*} tag & = ((((((\overline{m}_1 \cdot r) + \overline{m}_2) \cdot r) + \cdots + \overline{m}_n) \cdot r) \bmod{(2^{130} - 5)}) + s \\ & = (\overline{m}_1 r^n + \overline{m}_2 r^{n-1} + \cdots + \overline{m}_n r) \bmod{(2^{130} - 5)} + s \end{align*}$$

$r$ and $s$ are secrets, and they come from the Poly1305 key. Note that if we didn’t add $s$ at the end, then the resulting polynomial would be a polynomial in $r$, and one could use polynomial root-finding methods to figure out $r$ from the authentication tag, without knowledge of the key. Therefore it’s important that $s$ is non-zero.

In Python, this is what a Poly1305 implementation could look like (disclaimer: this is for learning purposes, and not necessarily secure or optimized for performance):

def iter_blocks(message: bytes): """ Splits a message in blocks of 16 bytes (128 bits) each, except for the last block, which may be shorter. """ start = 0 while start < len(message): yield message[start:start+16] start += 16 def poly1305(key: bytes, message: bytes): assert len(key) == 32 # 256 bits # Prime for the evaluation of the polynomial p = (1 << 130) - 5 # Split the key into two parts r and s r = int.from_bytes(key[:16], 'little') # 128 bits s = int.from_bytes(key[16:], 'little') # 128 bits # Clamp r r = r & 0x0ffffffc0ffffffc0ffffffc0fffffff # Initialize the state a = 0 # Update the state with every block for block in iter_blocks(message): # Append a 1-bit to the end of each block block = block + b'\1' # Convert the block to an integer c = int.from_bytes(block, 'little') # Update the state a = ((a + c) * r) % p # Add s to the state and truncate it to 128 bits, removing the most # significant bits and keeping only the least significant 128 bits a = (a + s) & ((1 << 128) - 1) # Convert the state from an integer to a 16-byte string (128 bits) return a.to_bytes(16, 'little')

And here is an example of how that code could be used:

key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef') msg = b'I had a very nice day today at the beach' print(poly1305(key, msg).hex())

This returns b0c4cb74b3089e9a982e3baa90c1bb5f, which is the same result that we would get using OpenSSL:

openssl mac \ -macopt hexkey:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \ -in <(echo -n 'I had a very nice day today at the beach') \ poly1305

A few things to note:

  • The same key cannot be reused to construct two distinct tags. In fact, suppose that we use the same hash key to compute tag1 = Poly1305(key, msg1) and tag2 = Poly1305(key, msg2). Then, because $s$ is the same for both, we could subtract the two tags (tag1 - tag2) to remove the $s$ part and obtain a polynomial in $r$. From there, we could use algebraic methods to figure out $r$. Once we have $r$, we can use either one of the tags and compute $s$, therefore recovering the full secret key.

    Similarly, if the keys were generated using a predictable algorithm (for example, incrementally: key[i+1] = key[i] + 1), it would still be possible to use a similar approach to figure out the secret key.

    For this reason, Poly1305 keys must be unique and unpredictable. Generating Poly1305 keys randomly or pseudo-randomly is an acceptable approach. Authentication functions like Poly1305 are called one-time authenticators because they can be used only one time with the same key.

  • If we didn’t add the 1-bits at the end of each block (in other words, if we used the $m_i$ blocks instead of $\overline{m}_i$), then encrypting a message full of zero bits would be the equivalent of encrypting an empty message. Adding the 1-bits is a way to ensure that the length of the message always has an effect on the output.

Use of Poly1305 with ChaCha20 (ChaCha20-Poly1305)

Let’s see how we can combine ChaCha20 and Poly1305 to construct an authenticated cipher. To recap:

  • ChaCha20 is a stream cipher;
  • Poly1305 is a one-time authenticator;
  • ChaCha20, like most ciphers, requires the use of a unique nonce to allow key reuse.

Putting the two together gives birth to ChaCha20-Poly1305. Here I’m going to describe how to implement it as standardized in RFC 8439.

The inputs to the ChaCha20-Poly1305 encryption function are:

  • a 256-bit secret key;
  • a 96-bit nonce;
  • a variable-length plaintext message.

The outputs from the ChaCha20-Poly1305 encryption function are:

  • a variable-length ciphertext (same length as the input plaintext);
  • a 128-bit authentication tag.

The ChaCha20-Poly1305 decryption function will accept the same secret key, nonce, ciphertext, and authentication tag as the input, and produce either the plaintext or an error as the output. The error is returned in case the authentication fails.

Data flow during a ChaCha20-Poly1305 encryption. This shows the inputs in blue, the outputs in green, and the intermediate objects in red.

ChaCha20-Poly1305 works in the following way:

  1. The ChaCha20 stream cipher is initialized with the 256-bit secret key and the 96-bit nonce.

  2. The stream cipher is used to encrypt a 256-bit string of all zeros. The result is the Poly1305 subkey.

    If you recall how a stream cipher works, you should know that encrypting using a stream cipher is equivalent to performing the XOR of a random bit stream with the plaintext. Here the plaintext is all zeros, so the process of generating the Poly1305 subkey is equivalent to grabbing the first 256 bits from the ChaCha20 bit stream.

    We previously saw that the Poly1305 subkey must be unpredictable and unique in order for Poly1305 to be secure. The use of ChaCha20 with a unique nonce ensures that: because ChaCha20 is a stream cipher, its output will be random and unpredictable. Therefore, with this construction, the subkey will be unpredictable even if the nonce is predictable.

  3. The stream cipher is used to encrypt another 256-bit string. The result is discarded. This is equivalent to advancing the stream cipher state by 256 bits.

    This step may seem weird, and in fact is not needed for security purposes, but it’s a mere implementation detail. This step is here because ChaCha20 has an internal state of 512 bits. In the previous step we obtained the first 256 bits of the state, and this next step is to discard the rest of the state to start with a fresh state. There is no particular reason for requiring a fresh state. The reason why RFC 8439 does that is because… spoiler alert: ChaCha20 is a block cipher under the hood. Its block size is 512 bits. If you read the RFC, you’ll see that it asks to call the ChaCha20 block encryption function once, grab the first 256 bits, and discard the rest. Here I’m using ChaCha20 as a stream cipher, so I have to include this extra step to discard the bits.

  4. The plaintext is encrypted using the stream cipher.

    Note that this is done without resetting the state of the cipher. We are continuing to use the same stream cipher instance that was used to generate the Poly1305 subkey.

  5. The ciphertext is padded with zeros to make its length a multiple of 16 bytes (128 bits) and is authenticated using Poly1305, via the subkey generated in step 2.

    This step may be done in parallel to the previous one, that is: every time we generate a chunk of ciphertext, we feed it to the Poly1305 authentication function.

    Why pad the ciphertext before passing it to Poly1305? After all, ChaCha20 is a stream cipher, and Poly1305 can accept arbitrary-sized messages. Again, this is an detail of RFC 8439 and padding does not serve any specific purpose.

  6. The length of the ciphertext (in bytes) is fed into the Poly1305 authenticator. This length is represented as a 64-bit little-endian integer padded with 64 zero bits.

    The reason why the length is represented as 64 bits and padded (instead of representing it as 128 bits) will be clearer later: what I have given you so far is a simplified view of ChaCha20-Poly1305 and authenticated encryption in general. I will give you the full picture when talking about associated data later on, and at that point this step will be slightly modified.

  7. The ciphertext from ChaCha20 and the authentication tag from Poly1305 are returned.

The decryption algorithm works in a very similar way: ChaCha20 is initialized in the same way, the subkey is generated in the same way, the Poly1305 authentication tag is calculated from the ciphertext in the same way. The only difference is that ChaCha20 is used to decrypt the ciphertext (instead of encrypting the plaintext) and that the input authentication tag is compared to the calculated authentication tag before returning.

Here is a Python implementation of ChaCha20-Poly1305, based on the implementations of ChaCha20 and Poly1305 from pycryptodome (usual disclaimer: this code is for educational purposes, and is not necessarily secure or optimized for performance):

from Crypto.Cipher import ChaCha20 from Crypto.Hash import Poly1305 def chacha20poly1305_encrypt(key, nonce, message): assert len(key) == 32 # 256 bits assert len(nonce) == 12 # 96 bits # Initialize the ChaCha20 cipher with the key and nonce cipher = ChaCha20.new(key=key, nonce=nonce) # Derive the Poly1305 subkey using the ChaCha20 cipher subkey = cipher.encrypt(b'\0' * 32) # 256 bits subkey_r = subkey[:16] subkey_s = subkey[16:] # Initialize the Poly1305 authenticator with the subkey authenticator = Poly1305.Poly1305_MAC(r=subkey_r, s=subkey_s, data=None) # Discard the rest of the internal ChaCha20 state cipher.encrypt(b'\0' * 32) # 256 bits # Encrypt the message ciphertext = cipher.encrypt(message) # Authenticate the ciphertext authenticator.update(ciphertext) # Pad the ciphertext with zeros (to make it a multiple of 16 bytes) if len(ciphertext) % 16 != 0: authenticator.update(b'\0' * (16 - len(ciphertext) % 16)) # Authenticate the length of the associated data (0 for simplicity) authenticator.update((0).to_bytes(8, 'little')) # 64 bits # Authenticate the length of the ciphertext authenticator.update(len(ciphertext).to_bytes(8, 'little')) # 64 bits # Generate the authentication tag tag = authenticator.digest() return (ciphertext, tag) def chacha20poly1305_decrypt(key, nonce, ciphertext, tag): assert len(key) == 32 # 256 bits assert len(nonce) == 12 # 96 bits assert len(tag) == 16 # 128 bits # Initialize the ChaCha20 cipher and the Poly1305 authenticator, in the # same exact way as it was done during encryption cipher = ChaCha20.new(key=key, nonce=nonce) subkey = cipher.encrypt(b'\0' * 32) subkey_r = subkey[:16] subkey_s = subkey[16:] authenticator = Poly1305.Poly1305_MAC(r=subkey_r, s=subkey_s, data=None) cipher.encrypt(b'\0' * 32) # Generate the authentication tag, like during encryption authenticator.update(ciphertext) if len(ciphertext) % 16: authenticator.update(b'\0' * (16 - len(ciphertext) % 16)) authenticator.update((0).to_bytes(8, 'little')) authenticator.update(len(ciphertext).to_bytes(8, 'little')) expected_tag = authenticator.digest() # Compare the input tag with the generated tag. If they're different, the # plaintext must not be returned to the caller if tag != expected_tag: raise ValueError('authentication failed') # The two tags match; decrypt the plaintext and return it to the caller # Note that, because ChaCha20 is a symmetric cipher, there is no difference # between the encrypt and decrypt method: here we are reusing the same # exact code used during decryption message = cipher.encrypt(ciphertext) return message

And here is how it can be used:

key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef') nonce = bytes.fromhex('0123456789abcdef01234567') message = b'I wanted to go to the beach, but now I changed my mind' ciphertext, tag = chacha20poly1305_encrypt(key, nonce, message) decrypted_message = chacha20poly1305_decrypt(key, nonce, ciphertext, tag) assert message == decrypted_message print(f'ciphertext: {ciphertext.hex()}') print(f' tag: {tag.hex()}') print(f' plaintext: {decrypted_message}')

Running it produces the following output:

ciphertext: 5d9b09cc5d90ca9ddff2d3470cfd6b563c5158e952bfae6acf1ebf9a3b968a488a41969567ef5ccfe05dcf9e548567028ff374a754af tag: dac3c05d261920e278ceb22e2800aa95 plaintext: b'I wanted to go to the beach, but now I changed my mind'

This is the same output we would obtain by using the ChaCha20-Poly1305 implementation from pycryptodome directly:

from Crypto.Cipher import ChaCha20_Poly1305 key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef') nonce = bytes.fromhex('0123456789abcdef01234567') message = b'I wanted to go to the beach, but now I changed my mind' cipher = ChaCha20_Poly1305.new(key=key, nonce=nonce) ciphertext, tag = cipher.encrypt_and_digest(message) print(f'ciphertext: {ciphertext.hex()}') print(f' tag: {tag.hex()}')

As already stated, it is extremely important that the nonce passed to ChaCha20-Poly1305 is unique. It may be predictable, but it must be unique. If the same nonce is reused twice or more, we can:

  • Decrypt arbitrary messages without using the secret key, if we can guess at least one message from its ciphertext.

    This can be done using the techniques described at the beginning of this article: by recovering the random bit string from the XOR of the ciphertext with the guessed message.

  • Recover the Poly1305 subkey, and, at that point, tamper with ciphertexts and forge new, valid authentication tags.

    This can be done by using algebraic methods on the polynomial of the authentication tag.

There is also a variant of ChaCha20-Poly1305, called XChaCha20-Poly1305, that features an extended 192-bit nonce (the X stands for ‘extended’). This is described in an RFC draft but so far it hasn’t been accepted as a standard yet. I won’t cover XChaCha20 in detail here, because it’s slightly more complex and does not add much to the topic of this article, but XChaCha20-Poly1305 has better security properties than ChaCha20-Poly1305, so you should prefer it in your applications if you can use it. The reason why XChaCha20-Poly1305 has better properties than ChaCha20-Poly1305 is that, having a longer nonce, the probability of generating two random nonces with the same value are much lower.

Galois/Counter Mode (GCM)

Let’s now take a look at Galois/Counter Mode (GCM). This is commonly used with the Advanced Encryption Standard (AES), to construct the authenticated cipher AES-GCM. One main difference between Poly1305 and GCM is that Poly1305 can work with any stream or block cipher, while GCM is designed to work with block ciphers with a block size of 128 bits.

GCM was proposed by David McGrew and John Viega in 2004 and is standardized in NIST Special Publication 800-38D as well as RFC 5288. It takes its name from Galois fields, also known as finite fields, which in turn get their name from the French mathematician Évariste Galois, who introduced the concept of finite fields as we know them today.

As we did before with Poly1305, we are going to first see how the keyed hash function used by GCM works, and then we will see how to use it to construct an authenticated cipher like AES-GCM on top of it. Before we can do that though, we need to understand what are finite fields, and what specific type of finite fields are used in GCM.

Finite Fields (Galois Fields)

What is a field? A field is a mathematical structure that contains a bunch of elements, and those elements can interact with each other using addition and multiplication. For both these operations there’s an identity element and an inverse element. Addition and multiplication in a field must obey the usual properties that we’re used to: commutativity, associativity, and distributivity.

A well-known example of a field is the field of fractions. Here is why fractions form a field:

  • the elements of the field are the fractions;
  • addition is well-defined: if we add two fractions, we get a fraction out (example: $5/3 + 3/2 = 19/6$);
  • multiplication is also well-defined: if we multiply two fractions, we get a fraction out (example: $1/2 \cdot 8/3 = 4/3$);
  • the additive identity element is 0: if we add 0 to any fraction, we get the same fraction back;
  • the additive inverse element is the negated fraction (example: $5/1$ is the additive inverse of $-5/1$ because $5/1 + (-5/1) = 0$);
  • the multiplicative identity element is 1: multiplying any fraction by 1 yields the same fraction back;
  • the multiplicative inverse element is what we get if we swap the numerator with the denominator (example: $3/2$ is the multiplicative inverse of $2/3$ because $3/2 \cdot 2/3 = 1$)—except for 0, which does not have a multiplicative inverse.
  • and so on…

On top of addition, multiplication, and inverse elements, we can define derived operations like subtraction and division. Subtracting $a$ from $b$ is equivalent to adding $a$ to the additive inverse of $b$: $a - b = a + (-b)$. Similarly, division can be defined in terms of multiplication with multiplicative inverses ($a / b = a b^{-1}$).

Fields are a generalization of structures where addition, multiplication, subtraction, and division behave according to the rules that we’re used to. Field elements do not necessarily need to be numbers.

An example of something that is not a field is the integers. That’s because integers don’t have multiplicative inverses (for example, there’s no integer that multiplied by 5 makes the result equal to 1). However, there is a way to turn the integers into a field: if we take the integers and a prime number p, then we can construct the field of integers modulo p.

When we work in the integers modulo a prime p, whenever we see p appear in any of our expressions, we can replace it with 0. In other words, in such a field, p and 0 are two different ways to write the same element–they are two different representations of the same element.

Here is an example: in the field of integers modulo 7, the expression 5 + 3 equals 1 because

  • 5 + 3 evaluates to 8;
  • 8, by definition, is 7 + 1;
  • if 7 and 0 are the same element, then 7 + 1 is equal to 0 + 1
  • 0 + 1 evaluates to 1

What we have just seen is that 8 is just a different representation of 1, just like 7 is a different representation of 0. Different symbols, same object. Just like, in programming languages, we can have multiple variables point to the same memory location: here the numbers are like variables, and what they point to is what really matters.

In the field of integers modulo 7, the additive inverse for 5 is 2, because 5 + 2 = 7 = 0. If we manipulate the equation, we get that 5 = −2. In other words, 5 and −2 are two different representations of the same element, and for the same reason 2 and −5 are also two different representations of the same element. A similar story holds for multiplication: the multiplicative inverse for 5 is 3 because: 5 · 3 = 15 = 7 + 7 + 1 = 1, so we can write 5 = 3−1 as well as 3 = 5−1.

What we have just seen is an example of a finite field. It’s different from a general field because it contains a finite number of elements (unlike fractions, which do not have a limit). In the case of the integers modulo 7, the number of elements is 7, and the list of elements is: {0, 1, 2, 3, 4, 5, 6}, or {−3, −2, −1, 0, 1, 2, 3}, or {0, 1, 2, 3, 2−1, 3−1, 6}, depending on what representation we like the most.

A few words about terminology, notation, and equivalences of finite fields

There can be many ways to construct a finite field (or even a general field). I have given an example using numbers, but a field does not necessarily need to be formed from numbers. We can also use vectors, matrices, polynomials, and anything you would like. As long as addition, multiplication, identity elements, and inverse elements are well-defined, you can get a field. Using programming terms, you can think of a field as an interface or a trait that can have arbitrary implementations.

An important result in algebra is that finite fields with the same number of elements are unique up to isomorphism. This means that if two finite fields the same number of elements, then there is an equivalence relation between the two. The number of elements of a field is therefore enough to define a field. It’s not enough to tell us what the elements of the field look like, or how they can be represented, but it’s enough to know how it behaves. To denote a field with $n$ elements, there are two major notations: $GF(n)$ and $\mathbb{F}_{n}$.

Another important result in algebra is that $n$ may be either a prime number, or a power of a prime. For example, we can have finite fields with 2 elements, or with 9 (= 32) elements, but we cannot have a field with 6 (= 2·3) elements. For this reason, you will often find finite fields denoted as $GF(p^k)$ or $\mathbb{F}_{p^k}$, where $p$ is a prime and $k$ is an integer greater than 0. The prime $p$ is called characteristic of the field, while $n = p^k$ is called order of the field.

Some common fields also have their own notation: in particular, the field of integers modulo a prime $p$ is denoted as $Z/pZ$. This notation encodes the “building instructions” to construct the field, in fact:

  • $Z$ denotes the integers: $Z = \{\dots, -2, -1, 0, 1, 2, \dots\}$;
  • $pZ$ denotes the integers multiplied by $p$: $pZ = \{\dots, -2p, -p, 0, p, 2p\}$ (example: $2Z = \{\dots, -4, -2, 0, 2, 4, \dots\}$);
  • $A/B$ is a quotient. This is a way to define an equivalence relation between elements, and its meaning is: within $A/B$, all the elements of $B$ are equivalent to 0. In the case of $Z/pZ$, all the multiples of $p$ are equivalent to 0, which is indeed what happens with the integers modulo $p$. The way I described this equivalence relation earlier is by saying that multiples of $p$ are different representations for 0.

Note that the integers modulo a power of a prime ($Z/p^kZ$, with $k$ greater than 1) do not form a field. The problem is that elements in $Z/p^kZ$ sometimes do not have a multiplicative inverse. For example, in $Z/4Z$, the number 2 does not have a multiplicative inverse (there is no element that multiplied by 2 gives 1). A field $GF(p^k)$ with $k$ greater than 1 needs to be constructed in a different way. One such way is to use polynomials, as described in the next section.

Polynomial fields

Let’s now move our attention from integers to polynomials, like this one:

$$x^7 + 5x^3 - 9x^2 + 2x + 1$$

Polynomials are a sum of coefficients multiplied by a variable (usually denoted by the letter x) raised to an integral power.

Let’s restrict our view to polynomials that have integer coefficients, like the one shown above. Something that is not a polynomial with integer coefficients is $1/2 x^2 + x$, because it has a fraction in it.

Integers and polynomials with integer coefficients are somewhat similar to each other. They kinda behave the same in many aspects. One important property of integers is the unique factorization theorem: if we have an integer, there’s a way to write it as a multiplication of some primary factors. For example, the integer 350 can be factored as the multiplication of 2, 5, 5, and 7.

$$350 = 7 \cdot 5 \cdot 5 \cdot 2$$

This factorization is unique: we can change the order of the factors, but it’s not possible to obtain a different set of factors (there’s no way to make the number 3 appear in the factorization of 350, or to make the number 7 disappear).

Polynomials with integer coefficients also have a unique factorization. In the case of integers, We call the unique factors “prime numbers”; in the case of polynomials we have “irreducible polynomials”. And just like we can have a field of integers modulo a prime, we can have a field of polynomials modulo an irreducible polynomial.

Integers Polynomials (with integer coefficients) Unique factorization: $42 = 7 \cdot 3 \cdot 2$ Unique factorization: $x^3 - 1 = (x^2 + x + 1)(x - 1)$ Prime numbers: 2, 3, 5, 7, 11, … Irreducible polynomials: $x + 1$, $x^2 - 2$, $x^2 + x + 1$, … Integers modulo a prime number Polynomials modulo an irreducible polynomial

Let’s take a look at how arithmetic in polynomial fields work. Let’s take, for example, the field of polynomials with integer coefficients modulo $x^3 + x + 1$, and try to compute the result of $(x^2 + 1)(x^2 + 2)$. If we expand the expression, we get:

$$(x^2 + 1)(x^2 + 2) = x^4 + 3x^2 + 2$$

This expression can be reduced. Reducing a polynomial expression is the equivalent of what we were doing with integers modulo a prime, when we were saying that 8 = 7 + 1 = 1 (mod 7). That “conversion” from 8 to 1 is the equivalent of the reduction that we’re talking about here.

To reduce $x^4 + 3x^2 + 2$, first note that $x^4 = x \cdot x^3$. Also note that $x^3 = x^3 + x + 1 - x - 1$. Here we have just added and removed the term $x + 1$: the result hasn’t changed, but now the irreducible polynomial $x^3 + x + 1$ appears in the expression, and so we can substitute it with 0. Putting everything together, we get:

$$\begin{align*} (x^2 + 1)(x^2 + 2) & = x^4 + 3x^2 + 2 \\ & = x \cdot x^3 + 3x^2 + 2 \\ & = x \cdot (x^3 + x + 1 - x - 1) + 3x^2 + 2 \\ & = x \cdot (0 - x - 1) + 3x^2 + 2 \\ & = -x^2 - x + 3x^2 + 2 \\ & = 2x^2 - x + 2 \end{align*}$$

It’s interesting to note that, if the polynomial field is formed by an irreducible polynomial with degree $n$, then all the polynomials in that field will all have degree less than $n$. That’s because if any $x^n$ (or higher) appears in a polynomial expression, then we can use the substitution trick I just showed to reduce its degree.

Binary fields

Let’s now look at polynomials where coefficients are from the field of integers modulo 2, meaning that they can be either 0 or 1. This is an example of such a polynomial:

$$x^7 + x^4 + x^2 + 1$$

or, in a more explicit form, where we can clearly see all the coefficients:

$$1 x^7 + 0 x^6 + 0 x^5 + 1 x^4 + 0 x^3 + 1 x^2 + 0 x^1 + 1 x^0$$

These are called binary polynomials. It’s interesting to note that if we ignore the variables and the powers, and keep only the coefficients, then what we get is a bit string:

$$(1 0 0 1 0 1 0 1)$$

This suggests that there’s an interesting duality between binary polynomials and bit strings. This means, in particular, that binary polynomials can be represented in a very compact and natural way on computers.

The duality between binary polynomials and bit strings also suggests that perhaps we can use bitwise operations to perform arithmetic on binary polynomials. And this turns out to be true, in fact:

  • binary polynomial addition can be computed using the XOR on the two corresponding bit strings;
  • binary polynomial multiplication can be computed using XOR, AND and bit-shifting.

Computers are pretty fast at performing these bitwise operations, and this makes binary polynomials quite attractive for use in computer algorithms and cryptography.

Arithmetic with binary polynomials

The arithmetic of such polynomials is quite interesting: in fact, because $1 + 1 = 0$ (modulo 2), then also $x^k + x^k = 0$, in fact:

$$1 \cdot x^k + 1 \cdot x^k = (1 + 1) x^k = 0 \cdot x^k = 0$$

It’s easy to see that addition modulo 2 is equivalent to the XOR binary operator. And addition of two binary polynomials is equivalent to the bitwise XOR of their corresponding bit strings:

$$\begin{array}{ccccc} (x^3 + x^2 + 1) & + & (x^2 + x) & = & x^3 + x + 1 \\ \updownarrow & & \updownarrow & & \updownarrow \\ (1101) & \oplus & (0110) & = & (1011) \end{array}$$

Multiplication of binary polynomials can also be implemented as a bitwise operation on bit strings. First, note that multiplying a polynomial by a monomial is equivalent to bit-shifting:

$$\begin{array}{ccccc} (x^3 + x + 1) & \cdot & x^2 & = & x^5 + x^3 + x^2 \\ \updownarrow & & \updownarrow & & \updownarrow \\ (1011) & \ll & 2 & = & (101100) \end{array}$$

Then note that multiplication of two polynomials can be expressed as the sum of multiplications by monomials:

$$(x^3 + 1)(x^2 + x + 1) = (x^3 + 1) \cdot x^2 + (x^3 + 1) \cdot x^1 + (x^3 + 1) \cdot x^0$$

Putting everything together, we have multiplications by monomials (equivalent to bit-shifts) and sums (equivalent to bitwise XOR). This suggests that multiplication can be implemented on top of bitwise XOR and bit-shifting.

Here is some Python code to implement binary polynomial multiplication, where each polynomial is represented compactly as an int:

def multiply(a, b): """ Compute a*b, where a and b are two integers representing binary polynomials. a and b are expected to have their most significant bit set to the monomial with the highest power. For example, the polynomial x^8 is represented as the integer 0b10000. """ assert a >= 0 assert b >= 0 result = 0 while b: result ^= a * (b & 1) a <<= 1 b >>= 1 return result

Other than XOR and bit-shifting, this code also uses AND to “query” whether a certain monomial is present or not.

Here is an example of how to use the code:

a = 0b0101_0111 # x^6 + x^4 + x^2 + x + 1 b = 0b0001_1010 # x^4 + x^3 + x c = multiply(a, b) assert c == 0b0111_0110_0110 # x^10 + x^9 + x^8 + x^6 + x^5 + x^2 + x

Now that we have introduced binary polynomials, we can of course form binary polynomials modulo a binary irreducible polynomial. These form a finite field, which is more concisely called: binary field.

Note that in a binary field where the modulo is an irreducible polynomial of degree $n$, all polynomials in the field can be represented as $n$-bit strings, and all $n$-bit strings have a corresponding binary polynomial in the field.

Arithmetic in binary fields

If we have three integers $a$, $b$, and $p$, we can compute $(a + b) \bmod{p}$ or $a \cdot b \bmod{p}$ by performing the binary operation (addition or multiplication) and then taking the remainder of the division by $p$. This is a method that returns the results of addition or multiplication using a representation with the lowest number of digits possible.

What if instead of having 3 integers we have three binary polynomials $A$, $B$, and $P$ and we want to compute $(A + B) \bmod{P}$ or $A \cdot B \bmod{P}$? It turns out that these operations can be implemented with code that is even easier than the integer counterpart: no division needs to be involved!

Let’s start with addition: we have already seen that addition with binary polynomials can be implemented with a simple XOR operation. This means that if the degree of $A$ and $B$ is lower than the degree of $P$, then the result of $A + B$ is also going to have degree less than $P$, hence no reduction is needed. We can use the result as-is, without any transformation: adding two binary field elements can be implemented with a single XOR operation.

With multiplication the story is different: the product $A \cdot B$ may have degree equal to or higher than $P$. For example, if $A = B = x$ and $P = x^2 + 1$, the product $A \cdot B$ is equal to $x^2$, which has the same degree as $P$. We need to find a way to efficiently reduce the higher-degree terms of this product. To see one way to do that, note that we can write $P$ like this:

$$P = x^n + Q$$

where $n$ is the degree of $P$ (the maximum power of $P$) and $Q$ is another binary polynomial, with degree strictly lower than $n$. Rearranging the equation, we get:

$$x^n = P + Q$$

Note that subtraction and addition are the same operations in a binary field. Because $P$ equals 0, we can write:

$$x^n = Q$$

This equivalence gives us a way to eliminate higher-level terms that appear during multiplication: whenever we see an $x^n$ appearing in the result, we can remove that term and add $Q$ instead. One way to do that, using binary strings, is to discard the highest bit (the one corresponding to $x^n$) and XOR with the binary string corresponding to $Q$.

Another way to do it is to just add $P$ (XOR by the binary string corresponding to $P$). This is equivalent to adding 0, results in the more compact representation that we’re interested in.

We could use similar tricks to eliminate terms like $x^{n+1}$, but these tricks are not necessary if we eliminate $x^n$ terms as soon as they appear in an iterative way.

Here is some Python code for multiplication in binary fields that uses the “add $P$” trick just described:

def multiply(a, b, p): """ Compute a*b modulo p, where a, b and c are three integers representing binary polynomials. a, b and p are expected to have their most significant bit set to the highest power monomial. For example, the polynomial x^8 is represented as 0b10000. """ bit_length = p.bit_length() - 1 assert a >= 0 and a < (1 << bit_length) assert b >= 0 and b < (1 << bit_length) result = 0 for i in range(bit_length): result ^= a * (b & 1) a <<= 1 a ^= p * ((a >> bit_length) & 1) b >>= 1 return result

This code is essentially the same as the binary polynomial multiplication code we had before, except for this line in the for loop:

a ^= p * ((a >> bit_length) & 1)

This line is what “adds $P$” whenever adding the shifted $A$ would result in a $x^n$ term to appear.

Again, we achieved implementing multiplication using only XOR, AND and bit-shifting.

Note that the binary polynomial $P$ here does not necessarily need to be an irreducible polynomial for this algorithm to work. However, the resulting algebraic structure won’t be a field unless $P$ is irreducible. A similar story holds for integers: we can have integers modulo a non-prime number, but that’s not a field.

The GHASH keyed hash function

GCM uses a binary field. The irreducible binary polynomial that defines the binary field used by GCM is:

$$P = x^{128} + x^7 + x^2 + x + 1$$

We will call this field the GCM field. Note that this polynomial has degree 128, hence the GCM field elements can be represented as 128-bit strings, and each 128-bit string has a corresponding element in the GCM field.

The keyed hash function used by GCM is called GHASH and takes as input a 128-bit key. We will call this key $H$. This key is interpreted as an element of the GCM field.

The message to authenticate is split into blocks of 128 bits each: $M_1$, $M_2$, $M_3$, … $M_n$. If the length of the message is not a multiple of 128 bits, then the last block is padded with zeros. Each block of message is also interpreted as an element of the GCM field.

Here is how the authentication tag is computed from $H$ and the padded message blocks $M_1$, …, $M_n$:

  • The initial state (a GCM field element) is initialized to 0: $A_0 = 0$.
  • For every block of message $M_i$, the next state $A_i$ is computed as $A_i = (A_{i-1} + M_i) \cdot H \bmod{P}$.
  • The final state $A_n$ is returned as a 128-bit string.

What this function is doing is computing the following polynomial in $H$:

$$\begin{align*} Tag & = (((M_1 \cdot H + M_2) \cdot H + \cdots M_n) \cdot H) \bmod{P} \\ & = (M_1 H^n + M_2 H^{n-1} + \cdots M_n H) \bmod{P} \end{align*}$$

This construction is somewhat similar to the one from Poly1305, although there are important differences:

  • In Poly1305, the elements of the tag polynomial are integers modulo a prime, in GHASH they are elements of a binary field.
  • GHASH does not perform any step to encode the length of the message, hence the tag for an empty message will be the same as the tag for a sequence of zero blocks. We will see later that GCM fixes this problem by appending the length of the message to the end of the input passed to GHASH.
  • Most importantly, the final $Tag$ polynomial is a polynomial in one unknown, and as such $H$ may be easily recoverable using algebraic methods. For this reason, GHASH is not suitable as a secure one-time authenticator. We will see that GCM fixes this problem by encrypting the output of GHASH.
Use of GCM with AES (AES-GCM)

GCM is the combination of a block cipher, Counter Mode (CTR), and the GHASH function that we have just seen. The block cipher is often AES. When we combine AES with GCM, the what we get is AES-GCM, which is described below. However the block cipher does not necessarily need to be AES: what is important is that the block size of the cipher is 128 bits, and that’s because GHASH only works on 128-bit blocks.

The inputs to the AES-GCM encryption function are:

  • a secret key (the length of the key depends on the variant of AES used: if AES-128, this will be 128 bits);
  • a 96-bit nonce;
  • a variable-length plaintext message.

The outputs of the AES-GCM encryption function are:

  • a variable-length ciphertext (same length as the input plaintext);
  • a 128-bit authentication tag.

The AES-GCM decryption function will accept the same secret key, nonce, ciphertext, and authentication tag as the input, and produce either the plaintext or an error as the output. The error is returned in case the authentication fails.

Data flow during an AES-GCM encryption. This shows the inputs in blue, the outputs in green, and the intermediate objects in red.

AES-GCM works in the following way:

  1. The GHASH subkey $H$ is generated by encrypting a zero-block: $H = \operatorname{Encrypt}(key, \underbrace{000\dots0}_\text{128 bits})$.

  2. The block cipher AES is initialized in Counter Mode (AES-CTR) with the key, the nonce, and a 32-bit, big-endian counter starting at 2.

  3. The plaintext is encrypted using the instance of AES-CTR just created.

  4. The GHASH function is run with the following inputs:

    • the subkey $H$, computed in step 1;
    • the ciphertext padded with zeros to make its length a multiple of 16 bytes (128 bits), concatenated to the length (in bits) of the ciphertext represented as a 128-bit big-endian integer.

    The result is a 128-bit block $S = \operatorname{GHASH}(H, ciphertext || padding || length)$.

  5. The AES-CTR counter is set to 1.

  6. The block $S$ is then encrypted using AES-CTR. The result of the encryption is the authentication tag.

    Note that, because $S$ matches the block size of the cipher, this encryption won’t cause the counter value 2 to be reused.

  7. The ciphertext and authentication tag are returned.

Here is how AES-GCM and GHASH can be implemented in Python, using the AES implementation from pycryptodome (usual disclaimer: this code is for educational purposes, and it’s not necessarily secure or optimized for performance):

from Crypto.Cipher import AES def multiply(a: int, b: int) -> int: """ Compute a*b in the GCM field, where a and b are two integers representing elements of the GCM field. a and b are expected to have their least significant bit set to the highest power monomial. For example, the polynomial x^125 is represented as 0b100. """ bit_length = 128 q = 0xe1000000000000000000000000000000 assert a >= 0 and a < (1 << bit_length) assert b >= 0 and b < (1 << bit_length) result = 0 for i in range(bit_length): result ^= a * ((b >> 127) & 1) a = (a >> 1) ^ (q * (a & 1)) b <<= 1 return result def pad_block(data: bytes) -> bytes: """ Pad data with zero bytes so that the resulting length is a multiple of 16 bytes (128 bits). """ if len(data) % 16 != 0: data += b'\0' * (16 - len(data) % 16) return data def iter_blocks_padded(data: bytes): """ Split the given data into blocks of 16 bytes (128 bits) each, padding the last block with zeros if necessary. """ start = 0 while start < len(data): yield pad_block(data[start:start+16]) start += 16 def ghash(subkey: bytes, message: bytes) -> bytes: subkey = int.from_bytes(subkey, 'big') assert subkey < (1 << 128) state = 0 for block in iter_blocks_padded(message): block = int.from_bytes(block, 'big') state = multiply(state ^ block, subkey) return state.to_bytes(16, 'big') def aes_gcm_encrypt(key: bytes, nonce: bytes, message: bytes): assert len(key) in (16, 24, 32) assert len(nonce) == 12 # Initialize a raw AES instance and encrypt a 16-byte block of all zeros to # derive the GHASH subkey H cipher = AES.new(mode=AES.MODE_ECB, key=key) h = cipher.encrypt(b'\0' * 16) # Encrypt the message with AES in CTR mode, with the counter composed by # the concatenation of the 12 byte (96 bits) nonce and a 4 byte (32 bits) # integer, starting from 2 cipher = AES.new(mode=AES.MODE_CTR, key=key, nonce=nonce, initial_value=2) ciphertext = cipher.encrypt(message) # Compute the GHASH of the ciphertext plus the ciphertext length in bits s = ghash(h, pad_block(ciphertext) + (len(ciphertext) * 8).to_bytes(16, 'big')) # Encrypt the GHASH value using AES in CTR mode, with the counter composed # by the concatenation of the 12 byte (96 bits) nonce and a 4 byte (32 # bits) integer set at 1. The GHASH value fits in one block, so the counter # won't be increased during this round of encryption cipher = AES.new(mode=AES.MODE_CTR, key=key, nonce=nonce, initial_value=1) tag = cipher.encrypt(s) return (ciphertext, tag) def aes_gcm_decrypt(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes): assert len(key) in (16, 24, 32) assert len(nonce) == 12 assert len(tag) == 16 # Compute the GHASH subkey, the GHASH value, and the authentication tag, in # the same exact way as it was done during encryption cipher = AES.new(mode=AES.MODE_ECB, key=key) h = cipher.encrypt(b'\0' * 16) s = ghash(h, pad_block(ciphertext) + (len(ciphertext) * 8).to_bytes(16, 'big')) cipher = AES.new(mode=AES.MODE_CTR, key=key, nonce=nonce, initial_value=1) expected_tag = cipher.encrypt(s) # Compare the input tag with the generated tag. If they're different, the # plaintext must not be returned to the caller if tag != expected_tag: raise ValueError('authentication failed') # The two tags match; decrypt the plaintext and return it to the caller. # Note that, because AES-CTR is a symmetric cipher, there is no difference # between the encrypt and decrypt method: here we are reusing the same # exact code used during decryption cipher = AES.new(mode=AES.MODE_CTR, key=key, nonce=nonce, initial_value=2) message = cipher.encrypt(ciphertext) return message

And here is how the code can be used:

key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef') nonce = bytes.fromhex('0123456789abcdef01234567') message = b'I went to the zoo yesterday but not today' ciphertext, tag = aes_gcm_encrypt(key, nonce, message) print(f'ciphertext: {ciphertext.hex()}') print(f' tag: {tag.hex()}') decrypted_message = aes_gcm_decrypt(key, nonce, ciphertext, tag) print(f' plaintext: {decrypted_message}') assert message == decrypted_message

This snippet produces the following output:

ciphertext: e0c32db2962f9b729c69028d9a1fdfb2c93839fc1188f314c58ee97fd6a242404953bb208df609a33c tag: 9fa6fe2f77a0c98282868924ace0e4ec plaintext: b'I went to the zoo yesterday but not today'

This is the same output we would obtain by using the AES-GCM implementation from pycryptodome directly:

from Crypto.Cipher import AES cipher = AES.new(mode=AES.MODE_GCM, key=key, nonce=nonce) ciphertext, tag = cipher.encrypt_and_digest(message) print(f'ciphertext: {ciphertext.hex()}') print(f' tag: {tag.hex()}')

Nonce reuse is catastrophic for AES-GCM in two ways:

  • Because the ciphertext produced by AES-GCM is just a variant of AES-CTR, nonce reuse with GCM can have the same consequences as nonce reuse with AES-CTR, or any other stream cipher: if someone is able to guess the plaintext, they can recover the random stream, and use that to decrypt other messages (or portions of them).

  • If the same nonce is used twice or more, the GHASH subkey $H$ will always be the same. Even if the output of GHASH is encrypted in step 7, we can use the XOR of two authentication tags to “cancel” the encryption and obtain a polynomial in $H$. From there, we can use algebraic methods to recover $H$. This gives us the ability to forge new, valid authentication tags.

It’s worth mentioning that there’s a variant of AES-GCM, called AES-GCM-SIV, (Synthetic Initialization Vector) specified in RFC 8452. This differs from AES-GCM in that it uses a little-endian version of GHASH called POLYVAL (which is faster on modern CPUs), and in that it allows nonce reuse without the two catastrophic consequences that I mentioned above.

(Nonce reuse with AES-GCM-SIV however still presents a problem, just not as serious as the two ones above: specifically, it breaks ciphertext indistinguishability.)

Authenticated Encryption with Associated Data (AEAD)

The way I have described authenticated encryption, and in particular the constructions ChaCha20-Poly1305 and AES-GCM, is accurate, but incomplete. What I have told you is that when you use an authenticated encryption cipher, the ciphertext is checked for integrity and authenticity. But we can use the same technique to authenticate anything, not just ciphertexts: we can, for example, authenticate some plaintext data, or authenticate a piece of plaintext data and a piece of ciphertext altogether.

When we use a method to authenticate a plaintext message only, what we get is a Message Authentication Code (MAC). We don’t use the word “encryption” in this context, because the confidentiality of the message is not ensured (only its authenticity).

When we use a method to authenticate both a ciphertext and a plaintext message, what we get is Authenticated Encryption with Associated Data (AEAD). In this construction, there are two messages involved: one to be encrypted (resulting in a ciphertext), and one to be kept in plaintext. The plaintext message is called “associated data” (AD) or “additional authenticated data” (AAD). Both the ciphertext and the associated data are authenticated at encryption time, so their integrity and authenticity will be enforced.

The inputs to the encryption function of an AEAD cipher are, generally speaking:

  • a key;
  • a nonce;
  • the additional data;
  • the message to encrypt.

The outputs of the encryption are:

  • the ciphertext;
  • the authentication tag.

Note that there’s only one authentication tag that covers both the additional data and the ciphertext.

The inputs to the decryption function are:

  • the key used for encryption;
  • the nonce used for encryption;
  • the additional data used for encryption;
  • the ciphertext.

And the output of the decryption is either an error or the decrypted message.

It’s important to note that the associated data must be both at encryption time and decryption time. Changing a single bit of it will make the entire decryption operation fail.

Both ChaCha20-Poly1305 and AES-GCM (and their variants, XChaCha20-Poly1305 and AES-GCM-SIV) are AEAD ciphers. Here’s how they implement AEAD:

  • When the Poly1305 or GHASH authenticator is first initialized, they are fed the additional data, padded with zeros to make its size a multiple of 16 bytes (128 bits).
  • Then the padded ciphertext is fed into the authenticator.
  • The length of the additional data and the length of the ciphertext are represented as two 64-bit integers, concatenated, and fed into the authenticator.
Updated data flow during a ChaCha20-Poly1305 encryption which shows where the Associated Data (AE) is placed. Updated data flow during an AES-GCM encryption which shows where the Associated Data (AE) is placed.

If the additional data is empty, then what you get are exactly the constructions that I described earlier in this article.

Authenticated Encryption with Associated Data is useful in situations where you want to encode some metadata along with your encrypted data. For example: an identifier for the resource that is encrypted, or the type of data encrypted (text, image, video, …), or some information that indicates what key and algorithm was used to encrypt the resource, or maybe the expiration of the data. The associated data is in plaintext so systems that do not have access to the secret key can gather some properties about the encrypted resource. It must however be understood that the associated data cannot be trusted until it’s verified using the secret key. Systems that analyze the associated data must be designed in such a way that, if the associated data is tampered, nothing bad will happen, and such tampering attempt will be detected sooner or later.

A word of caution

Something very important to understand is that when using authenticated encryption ciphers like ChaCha20-Poly1305 or AES-GCM, decryption can in theory succeed even if the verification of the authentication tag fails.

For example, we can decrypt a ciphertext encrypted with ChaCha20-Poly1305 by using ChaCha20 and ignoring the authentication tag. Similarly, we can decrypt a ciphertext encrypted with AES-GCM by using AES-CTR and, again, ignoring the authentication tag. This possibility opens the doors to all the nasty scenarios that we have seen at the beginning of this article, removing all the benefits of authenticated encryption.

Perhaps the most important thing to remember when using authenticated encryption is: never use decrypted data until you have verified its authenticity.

Why am I emphasizing this? Because some AE or AEAD implementations do return plaintext bytes before verifying their authenticity.

The code samples that I have provided do the following: they first calculate the authentication tag, compare it to the input tag, and only if the comparison succeeds they perform the decryption. This is a simple approach, but it may be expensive when encrypting large amounts of data (for example: several gigabytes). The reason why this approach is expensive is that, if the ciphertext is too large, it may not fit all in memory, and the ciphertext would have to be read from the storage device twice: once for calculating the tag, and once for decrypting the ciphertext. Also, chances are that by the time the application has computed the tag, the underlying ciphertext may have changed without detection.

What some authenticated encryption implementations do when dealing with large amounts of data is that they calculate the tag and perform the decryption in parallel. They read the ciphertext chunk-by-chunk, and pass each chunk to both the authenticator and the decryption function, returning a chunk of decrypted bytes to the caller at each iteration. Only at the end, when the full ciphertext has been read, the authenticity is checked, and the application may return an error only at that point. With such implementations, it is imperative that the exit status of the application is checked before using any of the decrypted bytes.

An implementation that works like that (returning decrypted bytes before authentication is complete) is GPG. Here is an example of the output that GPG produces when decrypting a tampered message:

gpg: AES256.CFB encrypted data gpg: encrypted with 1 passphrase This is a very long message. gpg: WARNING: encrypted message has been manipulated!

The decrypted message (“This is a very long message”) got printed, together with a warning, and the exit status is 2, indicating that an error occurred. It is important in this case that the decrypted message is not used in any way.

Other implementations avoid this problem by simply not encrypting large amounts of data. If given a large file to encrypt, the file is first split into multiple chunks of a few KiB, then each chunk is encrypted independently, with its own nonce and authentication tag. Because each chunk is small, authentication and decryption can happen in memory, one before the other. If a chunk was tampered, decryption would stop, returning truncated output, but never tampered output. It’s still important to check the exit status of such an implementation, but the consequences are less catastrophic than before. The drawback of this approach is that the total size of the ciphertext increases, because each chunk requires a nonce, an authentication tag, and some information about the position of the chunk (to prevent the chunks from being reordered). Storing the nonces or the positions can be avoided by using an algorithm to generate them on the fly, but storing the tag cannot be avoided.

The method of splitting that I have just described (of splitting long messages into chunks that are individually encrypted and authenticated) is used for example in TLS, as well as the command line tool AGE.

Summary and final considerations

At the beginning of this article we have seen some risks of using bare encryption ciphers: one of them in particular was malleability, that is: the property that ciphertexts may be modified without detection.

This problem was addressed by using Authenticated Encryption (AE) or Authenticated Encryption with Associated Data (AEAD), which are methods to provide integrity and authenticity in addition to confidentiality when encrypting data.

We have seen the details of the two most popular authenticated encryption ciphers and briefly mentioned some of their variants. Their features are summarized here:

Cipher Cipher Type Key Size Nonce Size Nonce Reuse Tag Size ChaCha20-Poly1305 Stream, AEAD 256 bits 96 bits Catastrophic 128 bits XChaCha20-Poly1305 Stream, AEAD 256 bits 192 bits Catastrophic 128 bits AES-GCM Stream, AEAD 128, 192, 256 bits 96 bits Catastrophic 128 bits AES-GCM-SIV Stream, AEAD 128 or 256 bits 96 bits Reduced risk 128 bits

Authenticated encryption is used in most of our modern protocols, including TLS, S/MIME, PGP/GPG, and many more. Failure to implement authenticated encryption correctly has lead to some serious issues in the past.

Whenever you’re using encryption, ask yourself: how is integrity and authentication verified? And remember: it’s essential to verify the authenticity of data before using it.

I hope you enjoyed this article! As usual, if you have any suggestions or spotted some mistakes, let me know in the comments or by contacting me!

Aaron Rainbolt: A brief history of my experience with Linux

Enj, 09/03/2023 - 12:21md

From an early age, I’ve been very interested in computers. They were fascinating and profoundly powerful devices, capable of doing things that were (and in some ways still are) beyond my understanding. Like many of us, I started on Windows, and like many of us who started out computing when we were kids, I did a lot of damage to my first computer. I still haven’t grown out of breaking computers, I’ve just gotten a lot better at planning how I’m going to break them and what to do in order to fix them after the fact.

Fast forward several years. I’ve heard of Linux, read books about how to do some things like web development on Linux, and am thinking about trying Linux. I wanted to do music recording at the time too (something I’m still planning on doing). I didn’t have a working Internet connection, so I got in the car with my friend and went to a place that had Internet in order to download some Linux distros to try out.

Thanks for reading Arraybolt's Archives! Subscribe for free to receive new posts and support my work.

The first distro I ever used was HP QuickWeb (Splashtop OS), which came with my Windows 7 laptop. I knew nothing about Linux at the time, I just found this QuickWeb thing included with my computer and decided to try it out in order to keep things isolated from my Windows installation. (In retrospect, this probably didn’t work at all, but I didn’t know that…) Basically it was nothing more than a Linux kernel, a simple and glitchy desktop, and a severely outdated copy of Firefox that I didn’t know better than to use. From this insecure and buggy mess, my journey into Linux began.

After fighting with QuickWeb’s browser for a bit, I learned about a distro called KXStudio, which was supposed to be geared towards music production. (KXStudio has since switched from being a distro to being a third-party repo that you use on top of Kubuntu, which severely confused me, since I didn’t know the difference between a repo and a distro at the time.) After a lot of searching, I managed to find KXStudio 14.04, from back when KXStudio was still a distro. I downloaded it, along with a few other distros that I ultimately didn’t end up getting to work, and also grabbed VirtualBox so that I could run the distros in a VM.

Let’s just say that combining a buggy distro with a total newbie resulted in chaos.

I went through quite a few “stages” learning how Linux worked, everything from “Where on earth is my C: drive?” to “Great, I just broke my audio, now what?”. It was pretty interesting. I also had to deal with the glitchiness and complexity of VirtualBox, as well as the fact that the distro I was using was not exactly newbie-friendly. The whole thing was a total mess. A very fun total mess. Before long, I had decided to install Linux on some physical hardware that I had available (which led to even more chaos, especially when UEFI came into the picture), and after days of headaches, many reinstalls, and tons of breakage, I was finally getting good enough at Linux usage to start using it as my daily driver on three different computers (a desktop and two laptops).

My third distro was ChaletOS 16.04. ChaletOS was essentially a modded Xubuntu variant made to look and feel a lot like Windows. It also has the Wine Windows Compatibility Layer preinstalled, making it compatible with a lot of Windows software out of the box. I didn’t use Wine very much - I was more interested in the beautiful UI, mostly smooth operation, and powerful features it provided. I was starting to get into more advanced Linux usage, and was for the first time using a sorta-halfway supported OS (all the ones I had used up until then were already EOL or very nearly EOL, if I remember correctly). I also had a working Internet connection, which greatly improved my ability to download software and research things. At this point I was almost entirely “sold” on Linux, and was only still using Windows on one laptop. That one laptop was important because it had a development environment on it that ran on Windows 7 inside a Hyper-V virtual machine inside Windows 8.0 Pro. I was reluctant to switch to Linux there because I knew Hyper-V wouldn’t run on Linux and I didn’t want to mess up the Windows 7 VM’s activation status by moving from Hyper-V to QEMU.

While using ChaletOS on my other laptop (not the one with the important development environment), I tried a bunch of different distros in VMs, including really weird ones like BasicLinux (don’t ever use it, it’s a nightmare) and Calculate Linux (Gentoo derivative, that was a total fail because hilariously I couldn’t figure out what the package manager’s name was). And then, in the midst of my distro-hopping adventures, I stumbled across two distros that almost immediately became my favorites. It was around the time of April 2020. Ubuntu 20.04 and its flavors has just been released recently. And it was during that time I discovered Kubuntu and Lubuntu.

I can’t remember which one I tried first, but I remember quite well some of my initial experiments with them. After playing with both Kubuntu and Lubuntu in VMs, I decided to install Lubuntu onto a flash drive and boot my non-development laptop from it. The extremely smooth experience Lubuntu 20.04 provided, the beautiful and simple UI, everything was almost perfect. Kubuntu was really nice, but it was a bit heavier and the laptop I was using for testing only had 4 GB RAM and no internal hard drive (it always booted from a USB stick, or from an SD card using a hack), so a lightweight OS was important to me. I used that OS a lot, and even booted my important development laptop from my Lubuntu USB stick to see what would happen (it worked quite well).

While I ended up loving Lubuntu, I was somewhat partial to Kubuntu since it was the basis of KXStudio, and the extreme power of the KDE desktop was a serious plus that Lubuntu’s LXQt desktop didn’t have. I had ended up having a slight mess with my Windows 7 VM on the development laptop which resulted in me having to re-activate it (long story short, don’t expect to restore a Hyper-V VM from a backup of only the .vhd files without the rest of the VM config!), and as a result, I was no longer so scared of having to move my VM to Linux using QEMU. I took the plunge and wiped Windows 8 with Kubuntu 20.04 on my main development laptop, then got my Windows 7 VM set up inside QEMU (where it actually worked better than it did on Hyper-V). Everything was working exactly as I had hoped (except for one small hiccup that resulted in an OS reinstall due to a combination of inexperience and paranoia, but that was overcome without too much trouble in the long run).

After a long bit of life lived in 20.04-land (and a brief stint with Lubuntu and Ubuntu Studio 21.10), Ubuntu 22.04 was released. I had finally mostly gotten a grip on how Ubuntu and Linux really worked, and was interested in trying to help with Ubuntu’s development. I had looked into helping in the past, but hadn’t actually gone ahead and tried to help. I joined the #lubuntu-devel channel with the nick “Guest3” and asked about the possibility of writing software documentation for Lubuntu. I surprisingly got replied to almost immediately by two members of the Lubuntu team, switched my IRC nick from Guest3 to arraybolt3, and before long was helping with testing, documentation writing, and debugging. (It all just kinda happened - I was there and stuff was happening and I ended up getting to try to help and it all worked really well.) I ended up helping with the development of Lubuntu and a couple of the other Ubuntu flavors during the Kinetic Kudu development cycle and the 22.04.1 release, and six hectic months of excitement later, Ubuntu 22.10 and its flavors were released.

That first step into the Ubuntu development world was about 10 months ago. We are now nearing the release of Ubuntu 23.04, and have since released 20.04.5 and 22.04.2. The joy of getting to contribute to an OS I’ve been using for years has been amazing, and I look forward to continuing to contribute in the future.

That was about four years of computing packed into a short article. Hope you found this enjoyable and helpful, and see you later!

Thanks for reading Arraybolt's Archives! Subscribe for free to receive new posts and support my work.

Serge Hallyn: It’s like…

Mar, 07/03/2023 - 10:46md

Let’s say you had one of those huge keyrings, with 50 keys on it. (Let’s say 41 keys – number chosen either randomly or bc its the current number of capabilities) On the plus side, when you want to let someone use one, you can magically, instantly make a copy to give to that person. On the down side, that person can do the same (give a copy to *anyone*), once you hand it to them. Further on the down side, well, you have 41 keys.

Now if you want to send the valet to take your car out, you have to give him a key to the elevator, key to the garage entry door, key to the car door, key to the car ignition, and a key to the garage-exit. You’ve done this yourself plenty so you know what keys to give.

But other parts of the building were designed by other people, who have placed gates in various places you’ve never visited, and know nothing about. So now you need the maid to go clean the fifth bedroom. How do you know what keys to give her?

You can start her on her way with the key to the elevator and the key to the bedroom. But she needed a key to pass a gate in the first hallway. So she comes back… all she knows to say is EPERM! She can’t even tell you where the gate is. So you walk with her to the first gate (strace), see the gate, figure out which key it needs, give it to her. Send her on her way.

Of course she hits another gate, comes back, and says EPERM!

Now, you’re not a quitter. And this is important. So you’ll keep trying until you figure out all the keys she needs. Unfortunately, there are 30 other people – chauffeurs, mechanics, janitors, butlers, maids, some delivery people, a gardener… They all need access to various places.

It gets worse. Some gates are only locked at certain times. So you think you’ve gotten them sorted out, but at 2am it turns out half of them are newly blocked. Oh, and next week the person in charge of the second floor decides a few more gates are really needed. Hopefully their newsletter will tell you where, and which keys they’ll use – but they’re busy, so probably not.

So, sure, you’re not usually a quitter, but it sure will be temping to give the full 41 keys to anyone who comes by.

Sean Davis: Xubuntu Minimal Visual Tour

Sht, 04/03/2023 - 12:14md

Xubuntu Minimal provides a lighter-weight alternative to the full-size Xubuntu desktop. The core desktop experience allows you to build your very own version of Xubuntu with the tricky configuration already done for you. Scroll through the screenshots below to see Xubuntu Minimal in action. When you&aposre done, download Xubuntu Minimal from cdimage.ubuntu.com to test it for yourself!

The Xubuntu Minimal desktop looks identical to the standard Xubuntu desktop.The application menu is very empty, featuring only the Accessories, Settings, and System categories.While browsing all applications (continued in the next two screenshots), you&aposll find only the essentials.The Mail Reader and Web Browser apps won&apost do much until you&aposve added your own.So, fire up the Xfce Terminal and apt install or snap install your favorites. NetworkManager is still included, so you can configure your network settings.The notification plugin is here, no loss of functionality.The PulseAudio plugin is emptier than normal, since there are no audio apps installed. Note: Xubuntu Minimal ships with PipeWire but is still controlled by the PulseAudio plugin.The clock is here to remind you of the time you saved while installing a much smaller Ubuntu flavor.The Settings Manager has a few less settings than normal, but is still fully functional.Here&aposs the bottom half of your available settings.No visual tour would be complete without a neofetch screenshot. You&aposre welcome.

Utkarsh Gupta: FOSS Activites in February 2023

Mar, 28/02/2023 - 6:41pd

Here’s my (forty-first) monthly but brief update about the activities I’ve done in the F/L/OSS world.

Debian

This was my 50th month of actively contributing to Debian. I became a DM in late March 2019 and a DD on Christmas ‘19! \o/

There’s a bunch of things I do, both, technical and non-technical. Here are the things I did this month:

Uploads Others
  • Looked up some Release team documentation.
  • Sponsored php-font-lib and php-dompdf-svg-lib for William.
  • Granted DM rights for php-dompdf.
  • Mentoring for newcomers.
  • Reviewed micro bits for Nilesh, new uploads and changes.
  • Ruby sprints.
  • Bug work (on BTS and #debian-ruby) for rails and redmine.
  • Moderation of -project mailing list.

A huge thanks to Freexian for sponsoring my Debian work and Entrouvert for sponsoring the Redmine backports. :D

Ubuntu

This was my 25th month of actively contributing to Ubuntu. Now that I joined Canonical to work on Ubuntu full-time, there’s a bunch of things I do! \o/

I mostly worked on different things, I guess.

I was too lazy to maintain a list of things I worked on so there’s no concrete list atm. Maybe I’ll get back to this section later or will start to list stuff from the fall, as I was doing before. :D

Debian (E)LTS

Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success.

And Debian Extended LTS (ELTS) is its sister project, extending support to the stretch and jessie release (+2 years after LTS support).

This was my forty-first month as a Debian LTS and thirty-second month as a Debian ELTS paid contributor.
I worked for 24.25 hours for LTS and 28.50 hours for ELTS.

LTS CVE Fixes and Announcements:
  • Fixed CVE-2022-47016 for tmux and uploaded to buster via 2.8-3+deb10u1.
    But decided to not roll the DLA for the package as the CVE got rejected upstream.
  • Issued DLA 3359-1, fixing CVE-2019-13038 and CVE-2021-3639, for libapache2-mod-auth-mellon.
    For Debian 10 buster, these problems have been fixed in version 0.14.2-1+deb10u1.
  • Issued DLA 3360-1, fixing CVE-2021-30151 and CVE-2022-23837, for ruby-sidekiq.
    For Debian 10 buster, these problems have been fixed in version 5.2.3+dfsg-1+deb10u1.
  • Worked on ruby-rails-html-sanitize and added notes to the security-tracker.
    TL;DR: we need newer methods in ruby-loofah to make the patches for ruby-rails-html-sanitize backportable.
  • Started to look at other set of packages meanwhile.
ELTS CVE Fixes and Announcements:
  • Issued ELA 813-1, fixing CVE-2017-12618 and CVE-2022-25147, for apr-util.
    For Debian 8 jessie, these problems have been fixed in version 1.5.4-1+deb8u1.
    For Debian 9 stretch, these problems have been fixed in version 1.5.4-3+deb9u1.
  • Issued ELA 814-1, fixing CVE-2022-39286, for jupyter-core.
    For Debian 9 stretch, these problems have been fixed in version 4.2.1-1+deb9u1.
  • Issued ELA 815-1, fixing CVE-2022-44792 and CVE-2022-44793, for net-snmp.
    For Debian 8 jessie, these problems have been fixed in version 5.7.2.1+dfsg-1+deb8u6.
    For Debian 9 stretch, these problems have been fixed in version 5.7.3+dfsg-1.7+deb9u5.
  • Helped facilitate RabbitMQ’s update queries by one of our customers.
  • Started to look at other set of packages meanwhile.
Other (E)LTS Work:

Until next time.
:wq for today.

Salih Emin: ucaresystem core 4.4.0 : Now published for Ubuntu Kinetik and Lunar

Pre, 17/02/2023 - 5:24md
The new release of ucaresystem core is released and available for Ubuntu Kinetik (22.10) and Ubuntu Lunar (23.04) The all time classic uCareSystem is now release for Ubuntu Kinetik (22.10) and Ubuntu Lunar (23.04). Also, in the same repository you will find an app that you can use to quickly fetch info about your OSContinue reading "ucaresystem core 4.4.0 : Now published for Ubuntu Kinetik and Lunar"

Lukas Märdian: Netplan v0.106 is now available

Mër, 15/02/2023 - 2:41md

I’m happy to announce that Netplan version 0.106 is now available on GitHub and is soon to be deployed into an Ubuntu/Debian/Fedora installation near you! Six months and 65 commits after the previous version, this release is brought to you by 4 free software contributors from around the globe.

Highlights

Highlights of this release include the new netplan status command, which queries your system for IP addresses, routes, DNS information, etc… in addition to the Netplan backend renderer (NetworkManager/networkd) in use and the relevant Netplan YAML configuration ID. It displays all this in a nicely formatted way (or alternatively in machine readable YAML/JSON format).

Furthermore, we implemented a clean libnetplan API which can be used by external tools to parse Netplan configuration, migrated away from non-inclusive language (PR#303) and improved the overall Netplan documentation. Another change that should be noted, is that the match.macaddress stanza now only matches on PermanentMACAddress= on the systemd-networkd backend, as has been the case on the NetworkManager backend ever since (see PR#278 for background information on this slight change in behavior).

Changelog Bug fixes:

Colin King: Integer shift gotcha

Pre, 10/02/2023 - 12:27md

 Left shifting values seems simple, but the following code contains a bug:

  The literal value 1 is actually a signed int, so the promotion to a uint64_t type will sign extend the value before assigning to x, causing x to be 0xffffffff80000000. The fix is simple, just make the literal value 1 an unsigned int using 1U instead of 1. 
  

Julian Andres Klode: Ubuntu 2022v1 secure boot key rotation and friends

Mër, 01/02/2023 - 2:40md

This is the story of the currently progressing changes to secure boot on Ubuntu and the history of how we got to where we are.

taking a step back: how does secure boot on Ubuntu work?

Booting on Ubuntu involves three components after the firmware:

  1. shim
  2. grub
  3. linux

Each of these is a PE binary signed with a key. The shim is signed by Microsoft’s 3rd party key and embeds a self-signed Canonical CA certificate, and optionally a vendor dbx (a list of revoked certificates or binaries). grub and linux (and fwupd) are then signed by a certificate issued by that CA

In Ubuntu’s case, the CA certificate is sharded: Multiple people each have a part of the key and they need to meet to be able to combine it and sign things, such as new code signing certificates.

BootHole

When BootHole happened in 2020, travel was suspended and we hence could not rotate to a new signing certificate. So when it came to updating our shim for the CVEs, we had to revoke all previously signed kernels, grubs, shims, fwupds by their hashes.

This generated a very large vendor dbx which caused lots of issues as shim exported them to a UEFI variable, and not everyone had enough space for such large variables. Sigh.

We decided we want to rotate our signing key next time.

This was also when upstream added SBAT metadata to shim and grub. This gives a simple versioning scheme for security updates and easy revocation using a simple EFI variable that shim writes to and reads from.

Spring 2022 CVEs

We still were not ready for travel in 2021, but during BootHole we developed the SBAT mechanism, so one could revoke a grub or shim by setting a single EFI variable.

We actually missed rotating the shim this cycle as a new vulnerability was reported immediately after it, and we decided to hold on to it.

2022 key rotation and the fall CVEs

This caused some problems when the 2nd CVE round came, as we did not have a shim with the latest SBAT level, and neither did a lot of others, so we ended up deciding upstream to not bump the shim SBAT requirements just yet. Sigh.

Anyway, in October we were meeting again for the first time at a Canonical sprint, and the shardholders got together and created three new signing keys: 2022v1, 2022v2, and 2022v3. It took us until January before they were installed into the signing service and PPAs setup to sign with them.

We also submitted a shim 15.7 with the old keys revoked which came back at around the same time.

Now we were in a hurry. The 22.04.2 point release was scheduled for around middle of February, and we had nothing signed with the new keys yet, but our new shim which we need for the point release (so the point release media remains bootable after the next round of CVEs), required new keys.

So how do we ensure that users have kernels, grubs, and fwupd signed with the new key before we install the new shim?

upgrade ordering

grub and fwupd are simple cases: For grub, we depend on the new version. We decided to backport grub 2.06 to all releases (which moved focal and bionic up from 2.04), and kept the versioning of the -signed packages the same across all releases, so we were able to simply bump the Depends for grub to specify the new minimum version. For fwupd-efi, we added Breaks.

(Actually, we also had a backport of the CVEs for 2.04 based grub, and we did publish that for 20.04 signed with the old keys before backporting 2.06 to it.)

Kernels are a different story: There are about 60 kernels out there. My initial idea was that we could just add Breaks for all of them. So our meta package linux-image-generic which depends on linux-image-$(uname -r)-generic, we’d simply add Breaks: linux-image-generic (« 5.19.0-31) and then adjust those breaks for each series. This would have been super annoying, but ultimately I figured this would be the safest option. This however caused concern, because it could be that apt decides to remove the kernel metapackage.

I explored checking the kernels at runtime and aborting if we don’t have a trusted kernel in preinst. This ensures that if you try to upgrade shim without having a kernel, it would fail to install. But this ultimately has a couple of issues:

  1. It aborts the entire transaction at that point, so users will be unable to run apt upgrade until they have a recent kernel.
  2. We cannot even guarantee that a kernel would be unpacked first. So even if you got a new kernel, apt/dpkg might attempt to unpack it first and then the preinst would fail because no kernel is present yet.

Ultimately we believed the danger to be too large given that no kernels had yet been released to users. If we had kernels pushed out for 1-2 months already, this would have been a viable choice.

So in the end, I ended up modifying the shim packaging to install both the latest shim and the previous one, and an update-alternatives alternative to select between the two:

In it’s post-installation maintainer script, shim-signed checks whether all kernels with a version greater or equal to the running one are not revoked, and if so, it will setup the latest alternative with priority 100 and the previous with a priority of 50. If one or more of those kernels was signed with a revoked key, it will swap the priorities around, so that the previous version is preferred.

Now this is fairly static, and we do want you to switch to the latest shim eventually, so I also added hooks to the kernel install to trigger the shim-signed postinst script when a new kernel is being installed. It will then update the alternatives based on the current set of kernels, and if it now points to the latest shim, reinstall shim and grub to the ESP.

Ultimately this means that once you install your 2nd non-revoked kernel, or you install a non-revoked kernel and then reconfigure shim or the kernel, you will get the latest shim. When you install your first non-revoked kernel, your currently booted kernel is still revoked, so it’s not upgraded immediately. This has a benefit in that you will most likely have two kernels you can boot without disabling secure boot.

regressions

Of course, the first version I uploaded had still some remaining hardcoded “shimx64” in the scripts and so failed to install on arm64 where “shimaa64” is used. And if that were not enough, I also forgot to include support for gzip compressed kernels there. Sigh, I need better testing infrastructure to be able to easily run arm64 tests as well (I only tested the actual booting there, not the scripts).

shim-signed migrated to the release pocket in lunar fairly quickly, but this caused images to stop working, because the new shim was installed into images, but no kernel was available yet, so we had to demote it to proposed and block migration. Despite all the work done for end users, we need to be careful to roll this out for image building.

another grub update for OOM issues.

We had two grubs to release: First there was the security update for the recent set of CVEs, then there also was an OOM issue for large initrds which was blocking critical OEM work.

We fixed the OOM issue by cherry-picking all 2.12 memory management patches, as well as the red hat patches to the loader we take from there. This ended up a fairly large patch set and I was hesitant to tie the security update to that, so I ended up pushing the security update everywhere first, and then pushed the OOM fixes this week.

With the OOM patches, you should be able to boot initrds of between 400M and 1GB, it also depends on the memory layout of your machine and your screen resolution and background images. So OEM team had success testing 400MB irl, and I tested up to I think it was 1.2GB in qemu, I ran out of FAT space then and stopped going higher :D

other features in this round
  • Intel TDX support in grub and shim
  • Kernels are allocated as CODE now not DATA as per the upstream mm changes, might fix boot on X13s
am I using this yet?

The new signing keys are used in:

  • shim-signed 1.54 on 22.10+, 1.51.3 on 22.04, 1.40.9 on 20.04, 1.37~18.04.13 on 18.04
  • grub2-signed 1.187.2~ or newer (binary packages grub-efi-amd64-signed or grub-efi-arm64-signed), 1.192 on 23.04.
  • fwupd-signed 1.51~ or newer
  • various linux updates. Check apt changelog linux-image-unsigned-$(uname -r) to see if Revoke & rotate to new signing key (LP: #2002812) is mentioned in there to see if it signed with the new key.

If you were able to install shim-signed, your grub and fwupd-efi will have the correct version as that is ensured by packaging. However your shim may still point to the old one. To check which shim will be used by grub-install, you can check the status of the shimx64.efi.signed or (on arm64) shimaa64.efi.signed alternative. The best link needs to point to the file ending in latest:

$ update-alternatives --display shimx64.efi.signed shimx64.efi.signed - auto mode link best version is /usr/lib/shim/shimx64.efi.signed.latest link currently points to /usr/lib/shim/shimx64.efi.signed.latest link shimx64.efi.signed is /usr/lib/shim/shimx64.efi.signed /usr/lib/shim/shimx64.efi.signed.latest - priority 100 /usr/lib/shim/shimx64.efi.signed.previous - priority 50

If it does not, but you have installed a new kernel compatible with the new shim, you can switch immediately to the new shim after rebooting into the kernel by running dpkg-reconfigure shim-signed. You’ll see in the output if the shim was updated, or you can check the output of update-alternatives as you did above after the reconfiguration has finished.

For the out of memory issues in grub, you need grub2-signed 1.187.3~ (same binaries as above).

how do I test this (while it’s in proposed)?
  1. upgrade your kernel to proposed and reboot into that
  2. upgrade your grub-efi-amd64-signed, shim-signed, fwupd-signed to proposed.

If you already upgraded your shim before your kernel, don’t worry:

  1. upgrade your kernel and reboot
  2. run dpkg-reconfigure shim-signed

And you’ll be all good to go.

deep dive: uploading signed boot assets to Ubuntu

For each signed boot asset, we build one version in the latest stable release and the development release. We then binary copy the built binaries from the latest stable release to older stable releases. This process ensures two things: We know the next stable release is able to build the assets and we also minimize the number of signed assets.

OK, I lied. For shim, we actually do not build in the development release but copy the binaries upward from the latest stable, as each shim needs to go through external signing.

The entire workflow looks something like this:

  1. Upload the unsigned package to one of the following “build” PPAs:

  2. Upload the signed package to the same PPA

  3. For stable release uploads:

    • Copy the unsigned package back across all stable releases in the PPA
    • Upload the signed package for stable releases to the same PPA with ~<release>.1 appended to the version
  4. Submit a request to canonical-signing-jobs to sign the uploads.

    The signing job helper copies the binary -unsigned packages to the primary-2022v1 PPA where they are signed, creating a signing tarball, then it copies the source package for the -signed package to the same PPA which then downloads the signing tarball during build and places the signed assets into the -signed deb.

    Resulting binaries will be placed into the proposed PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed

  5. Review the binaries themselves

  6. Unembargo and binary copy the binaries from the proposed PPA to the proposed-public PPA: https://launchpad.net/~ubuntu-uefi-team/+archive/ubuntu/proposed-public.

    This step is not strictly necessary, but it enables tools like sru-review to work, as they cannot access the packages from the normal private “proposed” PPA.

  7. Binary copy from proposed-public to the proposed queue(s) in the primary archive

Lots of steps!

WIP

As of writing, only the grub updates have been released, other updates are still being verified in proposed. An update for fwupd in bionic will be issued at a later point, removing the EFI bits from the fwupd 1.2 packaging and using the separate fwupd-efi project instead like later release series.

Stuart Langridge: Ronin

Hën, 30/01/2023 - 4:44md

In 1701, Asano Naganori, a feudal lord in Japan, was summoned to the shogun’s court in Edo, the town now called Tokyo. He was a provincial chieftain, and knew little about court etiquette, and the etiquette master of the court, Kira Kozuke-no-Suke, took offence. It’s not exactly clear why; it’s suggested that Asano didn’t bribe Kira sufficiently or at all, or that Kira felt that Asano should have shown more deference. Whatever the reasoning, Kira ridiculed Asano in the shogun’s presence, and Asano defended his honour by attacking Kira with a dagger.

Baring steel in the shogun’s castle was a grievous offence, and the shogun commanded Asano to atone through suicide. Asano obeyed, faithful to his overlord. The shogun further commanded that Asano’s retainers, over 300 samurai, were to be dispossessed and made leaderless, and forbade those retainers from taking revenge on Kira so as to prevent an escalating cycle of bloodshed. The leader of those samurai offered to divide Asano’s wealth between all of them, but this was a test. Those who took him up on the offer were paid and told to leave. Forty-seven refused this offer, knowing it to be honourless, and those remaining 47 reported to the shogun that they disavowed any loyalty to their dead lord. The shogun made them rōnin, masterless samurai, and required that they disperse. Before they did, they swore a secret oath among themselves that one day they would return and avenge their master. Then each went their separate ways. These 47 rōnin immersed themselves into the population, seemingly forgoing any desire for revenge, and acting without honour to indicate that they no longer followed their code. The shogun sent spies to monitor the actions of the rōnin, to ensure that their unworthy behaviour was not a trick, but their dishonour continued for a month, two, three. For a year and a half each acted dissolutely, appallingly; drunkards and criminals all, as their swords went to rust and their reputations the same.

A year and a half later, the forty-seven rōnin gathered together again. They subdued or killed and wounded Kira’s guards, they found a secret passage hidden behind a scroll, and in the hidden courtyard they found Kira and demanded that he die by suicide to satisfy their lord’s honour. When the etiquette master refused, the rōnin cut off Kira’s head and laid it on Asano’s grave. Then they came to the shogun, surrounded by a public in awe of their actions, and confessed. The shogun considered having them executed as criminals but instead required that they too die by suicide, and the rōnin obeyed. They were buried, all except one who was not present and who lived on, in front of the tomb of their master. The tombs are a place to be visited even today, and the story of the 47 rōnin is a famous one both inside and outside Japan.

You might think: why have I been told this story? Well, there were 47 of them. 47 is a good number. It’s the atomic number of silver, which is interesting stuff; the most electrically conductive metal. (During World War II, the Manhattan Project couldn’t get enough copper for the miles of wiring they needed because it was going elsewhere for the war effort, so they took all the silver out of Fort Knox and melted it down to make wire instead.) It’s strictly non-palindromic, which means that it’s not only not a palindrome, it remains not a palindrome in any base smaller than itself. And it’s how old I am today.

Yes! It’s my birthday! Hooray!

I have had a good birthday this year. The family and I had delightful Greek dinner at Mythos in the Arcadian, and then yesterday a bunch of us went to the pub and had an absolute whale of an afternoon and evening, during which I became heartily intoxicated and wore a bag on my head like Lord Farrow, among other things. And I got a picture of the Solvay Conference from Bruce.

This year is shaping up well; I have some interesting projects coming up, including one will-be-public thing that I’ve been working on and which I’ll be revealing more about in due course, a much-delayed family thing is very near its end (finally!), and in general it’s just gotta be better than the ongoing car crash that the last few years have been. Fingers crossed; ask me again in twelve months, anyway. I’ve been writing these little posts for 21 years now (last year has more links) and there have been ups and downs, but this year I feel quite hopeful about the future for the first time in a while. This is good news. Happy birthday, me.