Testing out snapshots in Apple’s next-generation APFS file system

We brave beta software and do some cautious testing—and it looks like it works.

Written by Adam H. Leventhal / Courtesy of ArsTechnica

Back in June, Apple announced its new upcoming file system: APFS, or Apple File System. There was no mention of it in the WWDC keynote, but devotees needed no encouragement. They picked over every scintilla of data from the documentation on Apple’s developer site, extrapolating, interpolating, eager for whatever was about to come. In the WWDC session hall, the crowd buzzed with a nervous energy, eager for the grand unveiling of APFS. I myself badge-swapped my way into the conference just to get that first glimpse of Apple’s first original filesystem in the 30+ years since HFS.

Apple’s presentation didn’t disappoint the hungry crowd. We hoped for a modern filesystem, optimized for next generation hardware, rich with features that have become the norm for data centers and professionals. With APFS, Apple showed a path to meeting those expectations. Dominic Giampaolo and Eric Tamura, leaders of the APFS team, shared performance optimizations, data integrity design, volume management, efficient storage of copied data, and snapshots—arguably the feature of APFS most directly in the user’s control.

Far from vaporware, Apple made APFS available to registered developers that day. The company included it in macOS Sierra as a technology preview. You can play with APFS today and a lot of the features are there. You can use space sharing to carve up a single disk into multiple volumes. You can see the speed of its directory size calculation—nearly instantaneous—compared with the slow process on HFS+. You can use clones to make constant-time copies of files or directories. At WWDC, Apple demonstrated the feature folks were the most eager to play with: snapshots. Tamura used snapshotUtil to create, list, and mount snapshots. But early adopters quickly discovered that snapshotUtil wasn’t part of the APFS technology preview.

Apple promised delivery in 2017. We all double-checked our HFS backups and waited.

A brand new day

It’s 2017, and Apple already appears to be making good on its promise with the revelation that the forthcoming iOS 10.3 will use APFS. The number of APFS tinkerers using it for their personal data has instantly gone from a few hundred to a few million. Beta users of iOS 10.3 have already made the switch apparently without incident. They have even ascribed unscientifically-significant performance improvements to APFS.

With APFS taking the next step, I decided to check back in on snapshots. There had been no news from Apple and nothing obviously new in macOS updates, but back in June I wrote about a clue Apple had left in macOS Sierra:

I used DTrace (technology I’m increasingly amazed that Apple ported from OpenSolaris) to find a tantalizingly named new system call fs_snapshot; I’ll leave it to others to reverse engineer its proper use.

With its proper use still, apparently, a mystery, and APFS freshly of interest, I dove back in.

The game is afoot

First a little background. An operating system roughly divides the world into the kernel and user processes. The kernel can, for the most part, do anything. It can talk to hardware devices; it can access all memory; it can execute privileged instructions. In short, it has unfettered access.

The kernel provides abstractions and imposes security for regular user processes. Have you ever seen ‘kernel_task’ in Activity Monitor? That’s the kernel using CPU, memory, or other resources. User programs are everything else: applications you run, the Finder, the windowing system, even the Dock or other pieces that modern parlance includes as part of the “operating system.”

A system call is simply a way for a user process to communicate with the kernel. If a program wants to write data to disk or get a larger memory allocation, it needs the kernel to verify permissions and execute those tasks; the system call is the mechanism that the user process uses. Note that the root user (or “sudo”) still relates to user processes, just ones that the kernel imbues with greater privileges.

I used DTrace to find the system call. DTrace is the dynamic tracing facility I co-authored at Sun with Bryan Cantrill and Mike Shapiro. It provides visibility into the whole system, from the kernel and device I/O to Java or Swift function calls. Naturally, DTrace includes visibility into system calls. Apple ported DTrace from Solaris in 2006; a typical Mac has hundreds of thousands of probes, discrete points of instrumentation, and we can list them with dtrace -l.

(Note that System Integrity Protection, or SIP, restricts some parts of DTrace; you’ll need to disable that protection before you can use them!)

I found the system call of interest by looking through DTrace’s system-call probes.

DTrace is an incredibly powerful tool for understanding how a system is behaving. Here, however, we’re just taking advantage of how DTrace can show us a definitive list of system calls. We can also see the fs_snapshot system call in the file /usr/include/sys/syscall.h (you’ll need the Xcode developer tools installed to do this).

It’s a little more straightforward, but less definitive since there’s no guarantee that code in a header file matches the running kernel.

A simple Google search for fs_snapshot immediately pointed me in the right direction, turning up a file in XNU on Apple’s open source website. XNU is the macOS kernel that came over from NeXT. Run uname -v and you’ll see the specific XNU version that your computer is running. For well over a decade, Apple has made XNU available as open source (and has done the same for many other macOS components). For a company known for its secrecy, it’s commendable that Apple has built such a tradition of transparency with at least some subset of their software. Commendable and quite the boon for anyone trying to enable an unpublished feature!

The first snapshot

Learning from XNU and making some educated guesses, I wrote my first C program to create an APFS snapshot. This section has a bit of code, which you can find in this GitHub repo.

Watch this space

Snapshots are going to be a powerful feature of APFS. Beyond creating snapshots, mounting them, and reverting volumes to earlier snapshots, they have the potential to form the basis for an efficient and robust backup system. Apple (or a third party!) could arrange for snapshots to be taken periodically and then back up files changed between snapshots when a backup device is available. That could be a disk in your house or a cloud service from Apple, Dropbox, Google, or someone else. With its unknown utilities and unpublished APIs, Apple has already enabled a whole new collection of backup tools.

The next APFS capability I’m hoping for is the ability to package and send the changes between snapshots or a related ability to identify files that were changed between when two snapshots were taken. Either would be a huge boon for backup and data integrity.

But who knows—maybe Apple’s already shipped it and it’s just waiting to be discovered.

Read the entire original article over at ArsTechnica.com.