Some platforms have really weird patching systems. If an update shifts all your data by even one byte, the patch engine falls apart and the "patch" ends up 100% the size of the new file. There are hard limits on how large patches can be, too. This sounds amateurish, but there are reasons for it: a fully-powered patching system has performance implications, specifically for the disk I/O system. And consoles are all about a bargain: the platform guarantees resources, and the developer guarantees they'll stay within them. So the patching apparatus cannot be infinitely powerful--it can't produce a fully optimal diff that requires unbounded time to uncrunch while referencing sectors scattered all over the version0 datafile. That last part is the killer: in the worst case, the individual seek and I/O time gets out of control, and so does the wear and tear on the flash media. I call it read amplification, because that's basically what it is.
If something in the above gave you a vague, creeping sense that it can't be right, here's the piece of the puzzle you're missing: the premise of a console platform's patching engine is that the version0 datafile is always present. It's on the physical media! The patch must be applied on the fly; there's no separate storage for the patched result. Even on modern consoles, this same principle may be in effect.
On the developer's side of that guarantee, then: we are supposed to read the docs about how the platform's patching apparatus works, and then design disk file formats which don't cause problems for it. This always catches Unity developers by surprise--the default way of doing things there makes a complete mess of the data every time you touch anything. It's always sad to see the horror slowly dawn on them. Well, it catches everyone by surprise at least a little the first time, but the developer of a proprietary engine would at least have had this kind of thing on their radar already (and it's less likely to be their first time doing anything).
But I'm personally in a bit of a pickle. See, I'm porting other people's engines, and they've unknowingly made poor choices about the structure of the packfiles. I don't want to hack the engine (part of my business plan is to do everything I can to touch engines and games as little as possible, to keep costs low.) But what do I do when I need to patch one of these games? I ran into this scenario last week. So here's how I handled it.
I have a file format which consists of an 8-byte record per 'sector', where the sector size can be tuned somewhat depending on the nature of the data. Each sector is assembled from what I call a 'literal prefix', a 'chunk', and a 'literal suffix'. The 8-byte records can be directly indexed by sector number. The literal prefixes and suffixes are all stored together at the end of the patch file; the 'chunk' comes from the version0 datafile. A sector may naturally be 100% literal prefix or 100% chunk--and commonly is. After all, we're dealing with packfiles, and unless the internal data has been re-indexed somehow so that a bunch of IDs and references change (we'll hope that doesn't happen), most of the data is the SAME--only moved around. Most importantly, the read amplification is bounded: reading one sector of data from the patched file requires reading no more than two sectors of data from the version0 file.
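To make the sector layout concrete, here's a minimal sketch of reading one patched sector. The record field layout, the 2048-byte sector size, and how literals are indexed are my own guesses for illustration--the post only specifies that each sector costs 8 bytes and is built as literal prefix + chunk-from-version0 + literal suffix.

```python
SECTOR = 2048  # tunable per title; this value is purely illustrative

def read_patched_sector(records, literals, v0, n):
    """Reassemble sector n of the patched file.

    records[n] = (prefix_len, chunk_off, chunk_len, lit_off) -- a
    hypothetical unpacking of the 8-byte per-sector record.
    literals is the blob of all prefixes/suffixes stored at the end
    of the patch file; v0 is the version0 datafile.
    """
    prefix_len, chunk_off, chunk_len, lit_off = records[n]
    suffix_len = SECTOR - prefix_len - chunk_len
    prefix = literals[lit_off : lit_off + prefix_len]
    # Only one contiguous range of version0 is touched, so one patched
    # sector costs at most two physical sector reads from version0.
    chunk = v0[chunk_off : chunk_off + chunk_len]
    suffix = literals[lit_off + prefix_len : lit_off + prefix_len + suffix_len]
    return prefix + chunk + suffix
```

The key property the format buys is visible right in the code: everything except `chunk` comes from the (in-memory) patch file, and `chunk` is a single contiguous slice of version0.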
I made an effort to find the best 'chunk' I possibly could. I exhaustively search for matches, using an acceleration structure: for each 16-bit value, a list of the offsets in the version0 file where a string beginning with that value starts. On average, that filter cuts the candidate strings down to 1/65536th of the total. With it, I just look for the longest matching string I can find. Even this would be excruciatingly slow (hours?), but I found another technique: the best string in version0 for version1's sector 124 is most commonly the string immediately following wherever sector 123's chunk came from in version0 (as long as sector 123's literal suffix was empty--that is, the sector concluded with a chunk). In other words, I can find the next piece of a file packed inside the packfile without needing to know anything about the packfile's structure. With THAT (and a 32-thread processor) I can handle a 1GB packfile in 30 seconds or so. The whole algorithm is trivially parallelizable. Even building the acceleration table is parallelizable (each of 256 threads indexes the strings starting with one particular first byte). That table runs about 10GB for a 1GB packfile, when all is said and done.
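A sketch of the matcher, under my own naming: a table mapping each 16-bit prefix to its occurrence offsets in version0, a greedy extension loop, and the cheap heuristic of first trying the bytes immediately after the previous sector's chunk. The early-out on a perfect continuation is an assumption about how the shortcut would be wired in; the post only describes the two ideas.

```python
from collections import defaultdict

def build_index(v0):
    """Map each 16-bit prefix to every offset in v0 where it occurs."""
    index = defaultdict(list)
    for i in range(len(v0) - 1):
        index[v0[i:i + 2]].append(i)
    return index

def longest_match(v0, index, target, hint=None):
    """Return (offset, length) of the longest prefix of `target` found in v0."""
    def extend(off):
        n = 0
        while n < len(target) and off + n < len(v0) and v0[off + n] == target[n]:
            n += 1
        return n

    # Heuristic: packfile data mostly just moved, so this sector's match
    # often begins right where the previous sector's chunk ended in v0.
    if hint is not None:
        n = extend(hint)
        if n >= len(target):      # perfect continuation: skip the search
            return hint, n

    best_off, best_len = -1, 0
    for off in index.get(target[:2], ()):
        n = extend(off)
        if n > best_len:
            best_off, best_len = off, n
    return best_off, best_len
```

In a real tool the per-sector searches would be farmed out across threads, as the post describes; the inner loop itself is unchanged by that.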
I did evaluate the existing patch file formats, but none of them were really designed for on-the-fly use--for sensible reasons. Some exotic filesystem somewhere may have had something like what I needed, but it would have been impossible to reuse, so I didn't even look.
Now to finally complete the picture, what we do is make version1 of the game out of the version0 packfile + the patch file. The patch that gets shipped to users is no larger than our patch file, then.
Actually, the format I devised is kind of bloated, so I compress the whole thing. It can compress to anywhere from 10% to 99% of the original file size, depending on details. Since compression is applied to the entire file--and you may have noticed my read-amplification analysis didn't discuss reading the literal parts--it follows that my patch files are loaded completely into memory. My defense is that any engine that makes such good use of a packfile that it grows big enough to cause problems will also be rather efficient about NOT loading too much data into memory, meaning I'll have plenty of room to hold a patch file. Hope I'm not wrong...
I had to modify my engine to try opening myfile.ext.patch any time myfile.ext is opened (actually this check is done when our filesystem is built during content building), and to be able to apply the patching on the fly. But there was one more read-amplification issue to deal with: since my patch file is sector-based, and games usually read entire multi-sector files from the packfile in one operation, I don't want that turned into several individual-sector reads from the physical filesystem. To make sure this doesn't happen, I interposed another layer of logic that 'coalesces' adjacent reads from the version0 file, deferring them until a read arrives for somewhere else.
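The coalescing layer can be sketched as a simple pass over the planned version0 reads: any request that begins exactly where the pending one ends is folded into it, and anything else forces the pending range out as a single large read. The function name and list-of-tuples interface are mine; the real layer presumably sits inside the engine's filesystem code.

```python
def coalesce(requests):
    """Merge back-to-back (offset, length) reads into single larger reads.

    requests is the ordered list of version0 reads generated by patching
    a multi-sector file read; adjacent ranges collapse into one I/O.
    """
    merged = []
    for off, n in requests:
        if merged and merged[-1][0] + merged[-1][1] == off:
            # Contiguous with the pending read: grow it instead of issuing.
            merged[-1] = (merged[-1][0], merged[-1][1] + n)
        else:
            merged.append((off, n))
    return merged
```

So a game's one long read that happens to map to a run of adjacent version0 chunks still reaches the physical media as one long read.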
Whew, that was a lot of work! I hope it works out. Part of the concept here is that no work is required IN ADVANCE, so despite my testing, I won't be 100% sure about it until I actually run into the scenario where I have to do the patch (this whole endeavour was precautionary before we shipped a game with huge packfiles).
In case this doesn't work out, I have another strategy planned: find the best-matching sector's worth of data in the version0 file and store the XOR of it against version1 in the patch. The whole patch then gets compressed (it's hopefully mostly zeroes), but that has to be done one sector at a time, since the mountain of zeroes would be too big for RAM. I think this approach has promise; it's just not what I ended up doing. Plus, the computation to find the "best-matching sector" is pretty rough--I don't want a patch file that takes days to compute... unless I'm really, REALLY desperate. Since the on-the-fly patching is built into my engine now, I could add an alternate patch format without too much work. Note that in order to avoid read-amplification issues, I would need to STRONGLY prefer sequences of sectors. I know it's paranoid, but I'd rather miss once and eat an inefficient sector than even risk a perverse case turning one long sequential read into dozens of short ones.
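A minimal sketch of that fallback, with the expensive best-match scan deliberately omitted (the post itself calls it rough): each version1 sector is XORed against a chosen version0 sector and compressed individually, so the zero-heavy stream never has to exist in RAM all at once. The sequential-preference tie-break below reflects the bias toward runs of sectors; everything else (names, zlib as the compressor, the 2048-byte sector) is an assumption for illustration.

```python
import zlib

def xor_patch_sector(v0, v1_sector, prev_off, sector=2048):
    """Produce (source_offset, compressed_delta) for one version1 sector.

    Strongly prefer the version0 sector right after the previous match,
    keeping version0 reads sequential. A real tool would fall back to a
    best-match scan here when the sequential guess is poor; that scan is
    omitted from this sketch.
    """
    off = prev_off + sector if prev_off is not None else 0
    src = v0[off:off + sector].ljust(sector, b"\0")
    # Well-matched sectors XOR to mostly zeroes, which compress very well.
    delta = bytes(a ^ b for a, b in zip(src, v1_sector))
    return off, zlib.compress(delta)
```

Applying it on the fly is the mirror image: decompress the sector's delta, read the recorded version0 offset, and XOR the two back together.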