Teaching Exercises
From Dan Shearer CV
These are some exercises and tricks I have either created or been subjected to over the years, and I have mentored students through them on many occasions.
Security
- Point Whonix at a server we control and try to de-anonymise a web page access using network capture and analysis. Compare with doing the same from a consumer operating system instead of Whonix.
- Construct a single-purpose computer of the kind used in embedded applications, such as Firefox/Chromium in kiosk mode on a laptop running Ubuntu Linux. Then destabilise the computer using every class of attack available: network, physical, software, side-channel and social engineering.
- The goal of security researchers (regardless of hat colour) is often to get control of userspace. For many years there have been Linux Play Machines online with the root password published and full shell access to userspace given to anonymous users from anywhere. Yet these machines are considered secure. What does this say about security generally? Build and attack such machines.
Complexity and Tech Robustness
- Follow instructions to install a common 2021 web stack from its component parts on a fresh virtual machine: Vue.js+Node.js+Apache+language server+SQL database. Say "hello world", and do basic reliability testing. Introduce small but plausible changes in the stack components to check they make an observable difference.
- Travel back to 1975 by booting IBM MVS 3.8 in Hercules. If the instructions are followed exactly it doesn't take long to get a working system. (Strong hint: follow the instructions, because your computing experience is probably irrelevant.) This takes about the same amount of developer time as Vue.js in the previous challenge to get to "hello world". Introduce small but plausible changes. Which stack is most likely to be working in ten years?
- Follow instructions to connect the Vue.js "hello world" application to MVS as its database. This is a ridiculous stack. Compare the stack levels and their fragility to a typical distributed microservice architecture with 7 levels of language involved. Which is most likely to be working in one year? Never mind the lines of code, just think about the number of translation layers. Which stack is the most ridiculous?
- Consider the modern computer and operating system of your choice printing "hello" from local storage. Using public information, estimate the number of lines of code in every element in the stack down to the CPU transistor level. Now apply common bug metrics to this result, and human factors engineering. How many people with which skills would be needed to fix any problem? Does it matter?
- Consider the transistor-up stack; which components publish source? Compare AMD to RISC-V at the bottom; does this cover all the code running on silicon? 30 billion transistors on a modern chip equates to many millions of lines of RTL, which is generated from far fewer millions of lines of VHDL/Verilog, which is often generated from even fewer lines of a high-level design language. Can we deduce anything about 3D chip designs with trillions of transistors and AI-assisted design tools, and a single source of supply in the Netherlands for machines to make 3D chips?
- This Reddit "Ask Me Anything" with the SpaceX developers gives some details of the rocket flight software. Draw an architecture diagram of the relevant stacks. What can we decide about complexity and reliability? Are these good choices? Can we conclude this is reliable software?
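For the lines-of-code exercise above, the arithmetic can be sketched as follows. This is only an illustration: the `Layer` class, the layer names, the line counts and the rate of one defect per thousand lines are all placeholder assumptions, not sourced figures. The point of the exercise is to replace them with numbers you can defend from public information.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    lines_of_code: int  # your own estimate, from public sources


def estimate_defects(layers, defects_per_kloc):
    """Return (total_lines, estimated_defects) for a stack of layers."""
    total_lines = sum(layer.lines_of_code for layer in layers)
    return total_lines, total_lines / 1000 * defects_per_kloc


# Placeholder figures only, chosen to keep the arithmetic easy to follow.
# Replace every one of them with a number you can source.
example_stack = [
    Layer("application", 1_000),
    Layer("runtime and libraries", 2_000_000),
    Layer("kernel", 30_000_000),
]

# The defect density is a hypothetical value, not a measured one.
total, defects = estimate_defects(example_stack, defects_per_kloc=1.0)
print(f"{total:,} lines, roughly {defects:,.0f} latent defects")
```

A single flat defect density is the crudest possible model. Firmware, microcode and hardware description code probably deserve their own rates, which is part of what the exercise asks you to argue about.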
Operating System Technology
- Linux from Scratch takes a few hours to get a prompt running from the bare components (a full system takes much longer). Use checksumming to compare the binaries created by different students' Linux from Scratch builds. Why aren't they all the same? Debian partly solved this in October 2021 after 20 years, while even NetBSD, a source distribution unlike Debian, still struggles. Does this kind of reproducibility matter?
- The Linux kernel source is a little under 30 million lines of code. Compile the smallest useful kernel you can, and estimate the number of lines of code used. Is Linux bloated? Compile a kernel on Ubuntu and estimate the number of lines of code. How much of this is running at boot time? Is Ubuntu bloated?
- Modify an operating system so that any time the user types "hocus pocus" in any context a log message is sent to a log server over the internet. Are there any limitations on your implementation?
- Modify an operating system to respond to a single network packet of a specified type. What would be good starting points for this?
- How many files are there in the smallest useful Linux deployment?
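The checksum comparison in the Linux from Scratch exercise above could start from something like this sketch. It assumes each student's build has been copied or mounted under its own directory; the function names `hash_tree` and `compare_trees` are made up for illustration, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
import os


def hash_tree(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue  # skip symlinks, sockets and device nodes
            sha = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large binaries do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    sha.update(chunk)
            digests[os.path.relpath(full_path, root)] = sha.hexdigest()
    return digests


def compare_trees(root_a, root_b):
    """Return (only_in_a, only_in_b, differing) as sorted lists of paths."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, differing
```

Calling `compare_trees` on two build directories gives a list of differing files to investigate, and `len(hash_tree(root))` gives a first answer to the file-count question. Neither explains why two binaries differ; that still needs a byte-level diff.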
Software Development
- I claim that among the hardest software engineering tasks is reliable progress bar estimates. Prove me wrong by implementing a progress bar that meets user expectations and handles the changing environment within a computer and from the outside world. Hint: what are the user's expectations? What are progress bars imagined to be communicating?
- Write an internet web application using Node.js to display all the information it can deduce about its network connection (where it is geographically and when, what standards are supported, etc). Explain how you can be sure this application will still run reliably in ten years' time and what the limits on this are. Repeat using C or Rust.
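As a starting point for the progress bar challenge above, here is a deliberately naive sketch of a remaining-time estimator. It is one common approach (an exponentially smoothed work rate), not a solution; the class name `EtaEstimator` and the `smoothing` parameter are assumptions made up for this example.

```python
class EtaEstimator:
    """Estimate remaining time from an exponentially smoothed work rate."""

    def __init__(self, total_units, smoothing=0.3):
        self.total_units = total_units
        self.smoothing = smoothing  # weight given to the newest rate sample
        self.rate = None            # smoothed units per second
        self.done = 0
        self.last_time = None

    def update(self, done_units, now):
        """Record that done_units are complete at time now (in seconds)."""
        if self.last_time is None:
            self.done, self.last_time = done_units, now
            return
        elapsed = now - self.last_time
        if elapsed <= 0:
            return  # ignore samples with no elapsed time
        sample = (done_units - self.done) / elapsed
        if self.rate is None:
            self.rate = sample
        else:
            self.rate = (self.smoothing * sample
                         + (1 - self.smoothing) * self.rate)
        self.done, self.last_time = done_units, now

    def remaining_seconds(self):
        """Return estimated seconds left, or None if no estimate is possible."""
        if self.rate is None or self.rate <= 0:
            return None
        return (self.total_units - self.done) / self.rate
```

Feed it measurements with `update(done, time.monotonic())` and watch what happens when the work stalls or speeds up: the estimate jumps around or grows, which is exactly what users dislike. Deciding what the bar should show instead is the real exercise.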