By Bill Piper, VP of Hardware Engineering, Wells Fargo
What is big data?

Big data may be one of the hottest IT industry terms today, despite the lack of consensus on its exact meaning. The word data is relatively straightforward; big is somewhat subjective. To complicate the issue further, some believe there is more to big data than size alone, and that the term refers to software specifically designed to manage and analyze large data sets. Others view big data as the next generation of business intelligence and analytics, while still others argue that it is not traditional business intelligence but a more exploratory approach. Personally, I view all of these as reasonable definitions, and define big data as all types of large, rapidly growing data, both structured and unstructured.

Rapid Data Growth

While the term big data has become popular in recent years, the trend of rapid data growth began at the dawn of the digital age. The hard disk drive was invented in the 1950s, with capacity measured in single-digit megabytes. These original disk drives used platters over twenty inches in diameter. Today we have 8 terabyte disk drives in a 3.5 inch form factor. This represents a capacity improvement of over a million times in the last sixty years, and does not take into account the significant reduction in the physical size of the devices. To put this into perspective, the amount of storage in a common cell phone today is larger than that of an entire room of disk drives just a couple of decades ago. Data growth is the primary driver behind the innovation we have seen in the storage technology space. Outpacing Moore’s law (doubling every 24 months) is not a challenge for the faint of heart.
Transformation of Storage Driven By Big Data

Over the last twenty years, nearly all organizations transitioned to centralized or shared storage technologies to deal with rapidly growing storage capacities and workloads. From an organizational perspective, this led to the creation of storage teams and departments. Physically separating compute and storage resources makes their utilization and capacity (space and performance) management functions independent. This independence has a profound impact on our day-to-day ability to run an optimized infrastructure. Adding storage capacity no longer requires more physical servers, and adding compute power no longer requires more physical storage devices.