Thursday, March 06, 2014 · Guest post by Ron Herardian · Posted at 9:05 AM
For the last three OCP Summits, engineers with a passion for open source technologies and hardware have come together for a 24-hour hackathon where they work nonstop in a competition to create the best hack. This year, three teams won the hackathon, and two of them will share their experiences on the OCP blog. Here’s the first blog post from Ron Herardian, who provides more detail about his team’s winning project.
Whenever new pieces of technology are introduced, the first question most engineers ask is how they can be used to make things run better and more efficiently. At the Open Compute Summit’s hardware hackathon, we got the chance to answer that question and were thrilled by what we found.
The team I worked with was a diverse group representing a range of companies. Our team comprised Andreas Olofsson from Adapteva, Inc., Peter Mooshammer, formerly with IBM, Jon Ehlen from Facebook, and Dimitar Boyn from I/O Switch Technologies, Inc., along with Rob Markovic, an independent consultant, and myself, a computer hobbyist and hacker. Rob and I were both acquainted with Dimitar from I/O Switch, though not with one another; the rest of the team had never met, and no plans existed prior to the event. Nonetheless, there was immediate synergy, and an ambitious plan took shape during an hour of brainstorming.
We decided on a project that we called Adaptive Storage, in which compute and storage resources would be loosely coupled over a network and would scale independently to optimize big data platforms like Hadoop. The project involved creating Hadoop data nodes using ARM processor-based micro servers and network-connected disk drives. I/O Switch provided a printed circuit board (PCB) that allowed disk drives to be cabled directly to network switches. Each Hadoop micro server node would control one or more disk drives over the network, but any micro server could read any disk drive. This would make it possible to dynamically recombine compute and storage resources on a common network switched fabric in a completely flexible way. If it worked, Adaptive Storage could be used to eliminate compute hotspots and coldspots in Hadoop.
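As a rough sketch of the loose coupling described above — the names and data structures here are illustrative only, not the hack's actual software — compute and storage can be modeled as independent sets on a shared fabric, where read access is many-to-many:

```python
# Minimal model of Adaptive Storage's loose coupling (illustrative only).
# Micro servers and drives live on one switched fabric; either set can
# grow without touching the other.

nodes = {"node-a", "node-b"}        # Hadoop micro servers (example names)
drives = {"e0.0", "e0.1", "e0.2"}   # network-attached disk drives

def readable(node, drive):
    # On the shared fabric, any micro server can read any disk drive.
    return node in nodes and drive in drives

# Compute and storage scale independently: adding one never requires the other.
nodes.add("node-c")    # scale compute only
drives.add("e0.3")     # scale storage only

assert readable("node-c", "e0.1")
assert all(readable(n, d) for n in nodes for d in drives)
```

The point of the sketch is the shape of the system: there is no array controller in the middle, only a switch, so membership in either set is independent of the other.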
Right from the start, the whole team was fascinated by the possibilities the new Parallella micro server offers cloud service providers and large enterprises. Although it is aimed at the hobbyist and education markets, Parallella is a powerful, flexible, and extendable computing platform. The Parallella computer has a dual-core Zynq Z-7020 ARM A9 processor together with a 16-core Epiphany Multicore Accelerator and one gigabyte of random access memory. It also has built-in gigabit Ethernet, Universal Serial Bus (USB), and High-Definition Multimedia Interface (HDMI) ports, as well as expansion connectors supporting up to 50 gigabits per second.
The Adaptive Storage concept was developed by Dimitar and Andreas with valuable input from other team members. The project required loosely coupling distributed Parallella compute capacity to storage resources over a network. This involved connecting disk drives to a network using Advanced Technology Attachment (ATA) over Ethernet (AoE) and running open source AoE drivers on Parallella Hadoop data nodes. Adapteva provided the Parallella hardware and Linux distribution while I/O Switch provided AoE to Serial ATA (SATA) PCBs (“AoE Enabler”) and other hardware to build the test lab environment.
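For context, the Linux AoE driver exposes each network-attached drive as a block device addressed by its AoE shelf and slot (e.g. /dev/etherd/e0.1). A tiny helper illustrates the naming convention; the shelf and slot numbers here are examples, not our lab's actual layout:

```python
# The Linux aoe driver creates /dev/etherd/e<shelf>.<slot> for each drive
# it discovers on the Ethernet segment. This helper just builds those paths.

def aoe_device(shelf, slot):
    """Return the block-device path the aoe driver uses for shelf.slot."""
    return f"/dev/etherd/e{shelf}.{slot}"

assert aoe_device(0, 1) == "/dev/etherd/e0.1"

# One Hadoop data node might manage several drives on the same shelf:
paths = [aoe_device(0, slot) for slot in range(3)]
assert paths == ["/dev/etherd/e0.0", "/dev/etherd/e0.1", "/dev/etherd/e0.2"]
```

Because every drive gets its own address on the Ethernet segment, a micro server takes over a drive simply by opening the corresponding block device — no storage array sits in between.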
The hack required building a custom Linux kernel and compiling open source driver code, and each member of the team quickly focused on their areas of expertise. Andreas’s hands-on knowledge of the Parallella platform and of Linaro Linux on the ARM processor was vital to the project. Jon was instrumental in showing how Parallella storage nodes and I/O Switch AoE Enablers could be deployed together in a real data center. Jon’s contributions of a real world use case and 3D CAD drawings tied the whole project together. In addition to solving many problems, Peter was able to prototype the entire software stack in a virtual machine environment, giving the team confidence that the project’s goals could be achieved. Rob and I set up the test lab, worked on troubleshooting, helped coordinate the team’s efforts, made emergency trips to the nearest Fry’s Electronics store, and prepared the team’s presentation.
The whole team worked late into the night on Tuesday, January 28. Dimitar and Andreas worked in shifts during the night getting the custom Linux kernel up and running and deploying Hadoop on the Parallella platform. A crucial moment came around 1:00 a.m. when the testbed Parallella computer overheated during a kernel compilation. We quickly solved the problem by lifting a fan from a prototype I/O Switch Hailstorm storage enclosure, borrowing some wire from another team and connecting the fan to the Parallella board.
After 24 hours of hard work, the team was still scrambling to wrap up as the presentations began, and we just finished our slides in time for a smooth presentation. Dimitar gave an in-depth live demo and handled the lengthy Q&A session including a discussion of how to use Adaptive Storage to implement Seagate’s Kinetic storage API or an Amazon S3-like RESTful API for scalable object storage.
In Adaptive Storage, disk drives are connected directly to network switches. There is no conventional storage array. Also connected to the switch are Parallella micro servers running Hadoop, each of which can handle data for one or more disk drives. Because every disk drive is individually connected to a network by an I/O Switch AoE Enabler PCB, any micro server can read from any disk drive. This means that micro servers can join forces to process complex jobs or larger data sets.
The idea is that micro servers can combine on demand, with recombination taking place dynamically. Since additional micro servers can be recruited automatically to process complex jobs or large data sets, Adaptive Storage is elastic for compute. Additional physical micro servers can be added to the network switched fabric at any time, independent of storage.
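The recruitment step can be pictured with a toy scheduler. Everything here — function names, the blocks-per-node capacity, the round-robin split — is an assumption for illustration, not part of the hack's actual code:

```python
# Toy sketch of compute elasticity: recruit just enough idle micro servers
# to cover a job's data blocks, then spread the blocks across them.

def recruit(idle_nodes, blocks, blocks_per_node):
    """Pop enough idle nodes to cover the job; return a node -> blocks plan."""
    needed = -(-len(blocks) // blocks_per_node)  # ceiling division
    if needed > len(idle_nodes):
        raise RuntimeError("not enough idle micro servers")
    workers = [idle_nodes.pop() for _ in range(needed)]
    # Round-robin the blocks over the recruited workers.
    return {w: blocks[i::needed] for i, w in enumerate(workers)}

idle = ["node-a", "node-b", "node-c"]
plan = recruit(idle, blocks=["b0", "b1", "b2", "b3", "b4"], blocks_per_node=2)
assert len(plan) == 3                            # ceil(5 / 2) nodes recruited
assert sum(len(v) for v in plan.values()) == 5   # every block is covered
```

When the job finishes, the recruited nodes would simply return to the idle pool, ready to combine differently for the next workload.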
Similarly, any micro server can take over unassigned disk drives on the network for exclusive write access or release them when they are no longer needed. This is also done on demand so that Adaptive Storage is elastic for storage resources. Additional physical disk drives can be added to the network any time independent of compute.
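The claim-and-release cycle for write access might be sketched as follows; again, all names and structures here are illustrative assumptions rather than the project's real implementation:

```python
# Toy sketch of storage elasticity: a micro server claims an unassigned
# drive for exclusive write access, then releases it back to the pool.

unassigned = {"e1.0", "e1.1"}   # drives on the fabric with no owner
owners = {}                     # drive -> micro server holding write access

def claim(node, drive):
    """Take exclusive write access to a drive, if nobody else holds it."""
    if drive in unassigned:
        unassigned.discard(drive)
        owners[drive] = node
        return True
    return False                # some other micro server already owns it

def release(drive):
    """Return a drive to the unassigned pool."""
    owners.pop(drive, None)
    unassigned.add(drive)

assert claim("node-a", "e1.0")
assert not claim("node-b", "e1.0")  # write access is exclusive
release("e1.0")
assert claim("node-b", "e1.0")      # released drives can be re-claimed
```

Reads need no such bookkeeping — any node can read any drive — so only write ownership moves around as demand shifts.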
In addition to dramatic power savings and independent, elastic scaling of compute and storage resources, Adaptive Storage is a simple and elegant way to eliminate compute hotspots and coldspots in Hadoop. But the concepts and methods of Adaptive Storage are not limited to Hadoop. They can be applied to virtually any big data technology, such as Cassandra or MongoDB, or to object storage in general. For example, Adaptive Storage is complementary to Seagate Kinetic because the Kinetic API can run on micro servers managing one or more disk drives connected to a network.
For production, Facebook’s standard half-width Knox OCP Serial Attached SCSI (SAS) Expander Board (SEB) can easily be replaced by a full-width Adaptive Storage Base Board, on which card guides and a riser card / backplane can be mounted. The entire structure can be supported on the sides by sheet metal brackets. Production readiness is straightforward from a mechanical and manufacturing perspective.
Adaptive Storage raises fundamental questions about the way storage and compute are connected and about the power requirements for big data. In just 24 hours, with no budget and with a few boxes of computers, circuit boards, and networking equipment, our small team of engineers was able to imagine a totally new way of organizing Hadoop data nodes, build and demonstrate a working prototype running on ARM processor-based micro servers using open source software, and show engineering CAD drawings for a production-ready implementation.
We know from experience that a similar project in a large technology company could take several months. That may be why we received so much interest in our hack from those at the OCP Summit. We can’t say for sure what the future holds for Adaptive Storage, but we were excited that so many companies and individuals who viewed our hack were intrigued by what they saw and wanted to continue working with it. This reinforced our belief in open source and in the Open Compute Project. We were able to use open source technology to build something great, and we’ll keep building, innovating, and developing with even more partners in the future.