TL;DR
I had the honor of recording not one, not two, but THREE lightboard videos for Pure Storage! And they’re up on YouTube for you to enjoy!
- How Volumes work on FlashArray
- How do snapshots make your life easier?
- Rapid recovery using FlashBlade
For today’s blog post, I want to focus on the subject matter of the first Lightboard video: How Volumes work on FlashArray.
Storage is Simple, Right?
For my many years as a SQL Server database developer and administrator, I always thought rather simplistically of storage. I had a working knowledge of how spinning media worked, and basic SAN & RAID architecture knowledge from a high level. And then flash media came along and I recall learning about its differences and nuances.
But fundamentally, storage still remained a simplistic matter in my mind – it was the physical location to write your data. Frankly, I never thought about how storage and a SAN could offer much much more than simply that.
A Legacy of Spinning Platters
Many of us, myself included, grew up with spinning platters as our primary storage media. Over the years, engineers came up with a variety of creative ways to squeeze out better performance. One progression was to move from a single disk to many disks working together in a SAN. That enabled us to stripe or “parallelize” a given workload across many disks rather than be stuck with the physical performance constraints of a single disk.
Carve It Up
In the above simplified example, we have a SAN with 16 disks. And let’s say that each disk gives us 1,000 IOPS at 4 KB. I have a SQL Server whose workload needs 4,000 IOPS for my data files and 6,000 IOPS for my transaction log. So I would have to create two volumes, each containing the appropriate number of disks from the SAN, to give me the performance characteristics that I require for my workload. (Remember, this is a SIMPLIFIED diagram to illustrate the general point.)
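To put numbers on that example, here is a minimal Python sketch of the carving arithmetic. It assumes performance scales linearly with disk count and ignores RAID overhead entirely, and the names `disks_needed`, `carve`, `sql_data`, and `sql_log` are made up for illustration, not part of any real SAN tooling.

```python
import math

DISK_IOPS = 1_000   # per-disk IOPS at 4 KB, taken from the example
TOTAL_DISKS = 16    # disks in the example SAN


def disks_needed(required_iops, disk_iops=DISK_IOPS):
    """Whole disks a volume needs to hit its IOPS target (linear-scaling assumption)."""
    return math.ceil(required_iops / disk_iops)


def carve(workloads, total_disks=TOTAL_DISKS):
    """Dedicate disks to each volume in turn; fail once the SAN runs out of disks."""
    free = total_disks
    layout = {}
    for name, iops in workloads.items():
        need = disks_needed(iops)
        if need > free:
            raise RuntimeError(f"not enough disks left for {name}")
        layout[name] = need
        free -= need
    return layout, free


layout, free = carve({"sql_data": 4_000, "sql_log": 6_000})
print(layout, free)  # {'sql_data': 4, 'sql_log': 6} 6
```

One SQL Server has already claimed 10 of the 16 disks, which leaves only 6 for every other server attached to the SAN.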
Now imagine being a SAN admin trying to juggle hundreds of volumes across dozens of attached servers, each with its own performance demands. Not only is that a huge challenge to keep organized, but it’s highly unlikely that every server will have its performance demands met, given the finite number of disks available. What a headache, right?
But what if we were no longer limited by the constraints presented by spinning platters? Can we approach this differently?
Letting Go Of What We Once Knew
One thing that can be a challenge for many technologists, myself especially, is letting go of old practices. Oftentimes those practices were learned the hard way, so we want to make sure we never have to go through those rough times again. Even when we’re presented with new technology, we often just stick to the “tried and true” way of doing certain things, because we know it works.
One of the “tried and true” things we can revisit with Pure Storage and FlashArray is the headache of carving up a SAN to get specific performance characteristics for our volumes. When Pure Storage first came to be, they focused solely on all-flash storage. As such, they were not tied to legacy spinning disk paradigms and could dream up new ways of doing things that suited flash storage media.
Abstractions For The Win
On FlashArray, a volume is not a subset or physical allocation of storage media assigned to it. Instead, a volume on FlashArray is just a collection of pointers to wherever the data wound up being landed.
Silly analogy: pretend you’re boarding a plane. On a traditional airline, first class typically boards first and goes to first class, then premium economy passengers board and go to their section, then regular economy boards and goes to its section, and finally basic economy boards and goes to theirs. But if you were on Southwest Airlines, you could choose your own seat. So you’d board and simply go wherever you wish (and pretend you report back to an employee that you’ve taken a particular seat). Legacy storage is like that traditional airline, where you (data) were limited to sitting in your respective seat class, because that’s how the airplane was pre-allocated. But on FlashArray, you’re not limited in that way and can simply sit where you like, because you (data) have access to sit anywhere.
Another way of describing it that might resonate is that legacy storage assigned disk storage to a volume and whatever data landed on that volume landed on the corresponding assigned disk. On FlashArray, the data can be landed anywhere on the entire array, and the volume that the data was written to simply stores a pointer to wherever the data wound up on the array.
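If it helps to see the idea as code, here is a toy Python model of "a volume is just a collection of pointers." It is only a sketch for intuition and not how FlashArray is actually implemented; the `Array` and `Volume` classes, the `land` method, and the append-only pool are all assumptions made up for this illustration.

```python
class Array:
    """Toy array: one shared pool of blocks that any volume can land data in."""

    def __init__(self):
        self.blocks = []  # shared physical pool (append-only in this sketch)

    def land(self, data):
        """Store data wherever there is room and return its location."""
        self.blocks.append(data)
        return len(self.blocks) - 1


class Volume:
    """Toy volume: a map from logical address to a location in the shared pool."""

    def __init__(self, array):
        self.array = array
        self.pointers = {}

    def write(self, lba, data):
        # The volume owns no media; it only remembers where the data landed.
        self.pointers[lba] = self.array.land(data)

    def read(self, lba):
        return self.array.blocks[self.pointers[lba]]


array = Array()
data_vol = Volume(array)
log_vol = Volume(array)

data_vol.write(0, b"row-1")
log_vol.write(0, b"log-1")
data_vol.write(1, b"row-2")

print(data_vol.pointers)  # {0: 0, 1: 2}
print(log_vol.pointers)   # {0: 1}
```

Notice that the two volumes' data ends up interleaved in the same pool. Neither volume was handed its own set of disks up front, yet each one can still find its data through its pointers.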
Fundamental Changes Make a Difference
This fundamental change in how FlashArray stores your data opens up a huge realm of other interesting capabilities that were either not possible or much more difficult to accomplish on spinning platters. I like to describe it as software-enhanced storage, because there are many things we’re doing on the software layer besides just “writing your data to disk.” In fact, we’re not quite writing your raw data to disk… there’s an element of “pre-processing” that takes place. But that’s another blog for another day.
Take 3 Minutes for a Good Laugh
If you want to watch me draw some diagrams on a lightboard that illustrate all of this, then please go watch How Volumes work on FlashArray. It’s only a few minutes long, and listening to me on 2x is quite entertaining in and of itself. Just be sure to re-watch it to actually listen to the content, because I guarantee you’ll be laughing your ass off at me chattering at 2x speed. 🙂
Thanks for reading!