Like they say on Reddit – tl;dr: HourGlass will get a better undo/version history.
The undo system for the envelopes in HourGlass has worked in a pretty odd way. All the envelopes (by default over 1000 of them) were stored in each undo history entry as arrays of bytes, even if an envelope had not changed at all compared to its state in the previous entry. Now, this obviously sounds like it would generate a ton of data, most of it completely redundant, and no surprise really: that’s exactly how it worked. A quite zany optimization was thus introduced: the multiple byte arrays for the envelopes were joined into a single byte array, which was then compressed using the zip compression algorithm. Yes, the same algorithm that is used for .zip files: http://en.wikipedia.org/wiki/Zip_(file_format) This in fact worked somewhat well to reduce the space needed for each undo step, and even the CPU cost of zipping and unzipping the undo states wasn’t really that bad. (In reality the situation was a bit more complicated, since map containers were also involved to associate the byte arrays with the parameters from which they originated, but the general concept is that the undo history was a dynamic array of dynamic arrays of zip-compressed data.)
The way the undo system worked has nagged me quite a lot over the years, and I recently finally decided to re-implement it. I have investigated copy-on-write techniques quite a lot over the past year and decided to use them for the HourGlass envelopes undo history. Or rather, maybe really “version history” from now on. I don’t think the ability to record changes to an application’s state and get them back should necessarily carry the rather sinister name “undo”, since creative/productive uses can be found for it as well. (For A/B/C/D… comparisons and so on. One could perhaps argue that such things should be implemented as separate features, but I happen to disagree with that somewhat.)
Copy-on-write is a way of managing data so that making actual copies is delayed until a copy of the data actually needs to change.
For example, an operating system can implement a scheme where it has a single page of memory consisting only of zeros, which it can hand out to processes when they initially request memory to be allocated. (It improves security to give processes memory that is filled with zeros instead of some left-over contents from other processes. This unfortunately doesn’t mean that in programming languages like C or C++ things will be automatically zero-initialized, but that is another issue.) Now, this zero page can be implemented so that it is initially shared among all the processes that have requested memory. Once a process wants to actually write into the memory it has allocated, the operating system switches the shared zero page to an actual reserved memory area. (Because the operating system is all-powerful in the computer, this can all happen “under the feet” of the user processes.) While this sounds a bit complicated and useless at first, it is not uncommon for a process to never actually write into all the memory it has requested, so making actual memory allocations can often be avoided. In this case, copy-on-write has achieved both a security improvement and a potential physical memory use optimization. (The “copy” in this scheme is the zeros that the process will initially see regardless of whether it has yet written to the memory or not. If it just reads the memory, it sees the shared zeros. If it does a write operation, it will then see zeros in a newly reserved memory area.)
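This zero-page behavior can be observed from user space. A rough illustration, assuming a POSIX-like system where mmap() with MAP_ANONYMOUS hands out zero-filled, lazily backed pages (the helper name map_zeroed is my own):

```cpp
// Anonymous private mappings are zero-filled and are typically backed by a
// shared zero page (or nothing at all) until the process first writes to
// them; only then does the kernel commit a real physical page.
#include <sys/mman.h>
#include <cstddef>

// Maps `size` bytes of anonymous, zero-filled memory; returns nullptr on failure.
inline char* map_zeroed(std::size_t size)
{
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<char*>(p);
}
```

Reading any byte of such a mapping yields zero without committing memory; the first write to a page is what triggers the actual allocation.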
The previous explanation is about a rather low-level thing in the operating system, but user-level applications can also leverage the same technique. The traditional use case (in C++) has been to optimize string objects. (edit: code snippet put to pastebin.com)
One might wonder, looking at the previous example, how the first = assignment can avoid making a real copy while the second causes something different to happen. The short answer is “magic”. The long one is that the compiler has invoked different overloads of the C++ copy constructors, copy assignment operators and accessors here, and those different overloads either just increase the data buffer’s reference count or “detach” (actually copy) the data buffer. If this sounds like the “wrong” overload might sometimes be picked and copies created when not really needed, that indeed is the case if the programmer isn’t quite careful. (Mostly in observing const-correctness.) We can see that implementing a copy-on-write string class with an API that is natural to use and really copies only on writes is problematic. But wait, it gets worse…
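To make the share-versus-detach mechanics concrete, here is a minimal copy-on-write string sketch of my own (the class and method names are assumptions, not from the post; to keep it short the mutating accessor gets a distinct name instead of a non-const operator[] overload):

```cpp
// Copying a CowString only bumps a shared reference count; the mutating
// accessor "detaches" (performs the real copy) when the buffer is shared.
#include <memory>
#include <string>
#include <cstddef>

class CowString
{
public:
    CowString(const char* s) : data_(std::make_shared<std::string>(s)) {}
    // The compiler-generated copy constructor / copy assignment just copy
    // the shared_ptr, i.e. increase the reference count -- no buffer copy.

    // const access: safe to read the shared buffer directly, never detaches
    char operator[](std::size_t i) const { return (*data_)[i]; }

    // mutable access: must detach first, because the caller may write
    char& at(std::size_t i)
    {
        detach();
        return (*data_)[i];
    }

    // true if this object currently shares its buffer with another CowString
    bool shared() const { return data_.use_count() > 1; }

private:
    void detach()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_); // the actual copy
    }

    std::shared_ptr<std::string> data_;
};
```

Note that checking use_count() like this is only safe when other threads can’t be copying the same object concurrently, which leads directly into the reference counting problem described next.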
The reference count to the shared data in a copy-on-write data structure can’t realistically be just a normal integer variable these days. The reason is that code will now very often run in a multithreaded context, so the reference counting must be implemented in a way that is safe for multiple threads to use concurrently. Even if some particular piece of code doesn’t currently use multiple threads, the reference counting should still support them properly. (Because maybe in the future the code will actually run in multiple threads, and if someone forgot that the reference counting in some piece of library code wasn’t thread-safe, only disaster and hilarity can ensue…) The most commonly used solution for the reference counter is an atomic variable.
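A thread-safe reference count is only a few lines; the sketch below (my own, with the usual shared_ptr-style memory orderings) shows the basic shape:

```cpp
// A minimal atomic reference count. Increments can be relaxed because the
// caller already holds a reference; the decrement uses acquire-release so
// that the thread which drops the last reference sees all prior writes
// before destroying the shared data.
#include <atomic>

struct RefCounted
{
    std::atomic<int> refs{1}; // the creator holds the first reference

    void add_ref()
    {
        refs.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns true when the caller dropped the last reference and is now
    // responsible for destroying the shared data.
    bool release()
    {
        return refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};
```

The catch, as discussed below, is that these atomic operations force the CPU cores to coordinate, which is exactly where the performance trouble comes from.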
Copy-on-write is no longer a popular technique for string classes, the greatest reason being the atomic reference counting inside. The atomic reference counter can in some scenarios cause multiple threads to just wait on each other to synchronize, which can be a huge performance problem on multi-CPU systems. Trying to save memory and make string copies fast can therefore render the very use of multiple threads to increase application throughput pointless. It would have been better if each thread had just used its own full copy of the string involved. I am myself also of the opinion that copy-on-write for strings probably isn’t a good idea anymore. However, Qt’s QString and gcc’s C++ standard library std::string still use it, although the problems the copy-on-write behavior was originally intended to solve largely don’t exist anymore. (Due to move semantics in C++11 and compilers that have improved to optimize copies away in the most obvious cases.)
So, getting back to HourGlass’s undo history… If copy-on-write has such suspicious properties in terms of actually managing to avoid copies and memory allocations, and causes problems in multithreaded environments, why would I choose the technique? Firstly, the data to be handled in HourGlass isn’t strings. Secondly, I have been using a copy-on-write wrapper class that requires each use of the contained object to explicitly state whether a read or a write is going to happen. This removes the ambiguities with the copy constructors and copy assignment operators, and it also usually enforces const correctness. There is some added noise in the code due to the read() and write() calls, but it’s a trade-off I am willing to live with.
Usage is like this (edit: code snippet put to pastebin.com because the WordPress code formatting doesn’t like C++ templates…):
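For readers who don’t want to follow the pastebin link, here is my own sketch of what such an explicit-read/write wrapper could look like. This is not the actual snippet: only the read() and write() names come from the post; the class name CowValue and everything else are assumptions.

```cpp
// A copy-on-write wrapper where the caller must explicitly choose between
// reading and writing. Copies of CowValue share the underlying object and
// are cheap: only the (atomic) reference count inside shared_ptr changes.
#include <memory>
#include <utility>

template <typename T>
class CowValue
{
public:
    CowValue() : data_(std::make_shared<T>()) {}
    explicit CowValue(T value) : data_(std::make_shared<T>(std::move(value))) {}

    // Explicit read: const access, never detaches.
    const T& read() const { return *data_; }

    // Explicit write: detach first if the object is shared, then allow mutation.
    T& write()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<T>(*data_); // the delayed copy happens here
        return *data_;
    }

    bool isShared() const { return data_.use_count() > 1; }

private:
    std::shared_ptr<T> data_;
};
```

With a wrapper like this, taking a history snapshot of an envelope is just copying the CowValue into the history container; a real copy of the envelope data only happens if the current state is later modified through write().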
Almost all use of the envelopes in HourGlass was changed to go via the read() and write() methods. This took me a few hours and wasn’t exactly “fun”, but I think the results are worth it. The undo (or version) history for the envelopes is now a lot cleaner, no tricks with byte arrays and zip compression are needed, and possibilities for various nice tricks now exist. For example, the state of just a single envelope can be easily and cleanly retrieved back from the history. Memory usage of the history entries has dropped to a third of the old system, or less. Adding more state besides the automation envelopes, like effects settings, has become a more plausible option too. I am somewhat on the fence about the effects settings, though. For example, I haven’t always found it useful in Cockos Reaper that all tweaks in plugins (or the mixer) go into its undo history. However, the way the state versions are now handled in HourGlass, it should be fairly trivial to implement this so that things are always stored but can be optionally recalled from the history. Unneeded states can also be removed from the history by user action.
The problem with the atomic reference counters is mostly negated by HourGlass having to synchronize access to the objects via mutex locking anyway, because I allow changing the states during audio playback. (That is, whatever performance decrease thread synchronization causes would already exist in any case.) It could be argued that mutex locking isn’t appropriate in a realtime audio playback context to begin with, but so far I really can’t say I have seen any ill effects from it, barring some obvious programming errors. On the side, I have been searching for and fixing places in the code where the mutex locks could be held too long, replacing them with much faster code that has predictable characteristics.
Thanks to Sean Parent for his talks, materials and emails on copy-on-write (and on other stuff). Apologies for any errors, omissions and inconsistencies.