We discuss the nexus of big data applications, software, and infrastructure, where we identify six overall machine architectures. The big data applications are drawn from a NIST study, and the layered software from a compendium of open-source, commercial, and HPC systems. We illustrate with typical “big data analytics” machine learning using varied parallel programming models (MPI, Hadoop, Spark, Storm) on both cloud and HPC platforms. We discuss performance (of Java) and the use of DevOps scripts such as Chef/Ansible and OpenStack Heat to specify the software stack. This leads to the interesting virtual cluster concept.
Fox received a Ph.D. in Theoretical Physics from Cambridge University and is now Distinguished Professor of Informatics and Computing, and Physics, at Indiana University, where he is Director of the Digital Science Center, Senior Associate Dean for Research, and Director of the Data Science program at the School of Informatics and Computing. He previously held positions at Caltech, Syracuse University, and Florida State University after being a postdoc at the Institute for Advanced Study at Princeton, Lawrence Berkeley Laboratory, and Peterhouse College Cambridge. He has supervised the Ph.D. work of 68 students and published around 1,200 papers in physics and computer science, with an h-index of 68 and over 26,000 citations.
He currently works on applying computer science, from infrastructure to analytics, in Biology, Pathology, Sensor Clouds, Earthquake and Ice-sheet Science, Image Processing, Deep Learning, Network Science, and Particle Physics. The infrastructure work is built around Software Defined Systems on Clouds and Clusters. He is involved in several projects to enhance the capabilities of Minority Serving Institutions. He has experience in online education and its use in MOOCs for areas like Data and Computational Science. He is a Fellow of APS and ACM.
The so-called “Moore’s Law”, by which processor performance increases exponentially by a factor of 4 every 3 years or so, is slated to end within a 10-15 year timeframe, as VLSI lithography reaches its limits around that time in combination with other physical factors. This is largely due to transistor power becoming essentially constant; as a result, ways to sustain continued performance growth must be sought other than raising the clock rate or the number of floating-point units on a chip, i.e., other than increasing FLOPS. The promising new parameter in place of the transistor count is the anticipated increase in the capacity and bandwidth of storage, driven by device, architectural, and packaging innovations: DRAM-alternative Non-Volatile Memory (NVM) devices, 3-D memory and logic stacking evolving from vias to direct silicon stacking, and next-generation terabit optics and networks. The overall effect is that the trend of increasing computational intensity, as advocated today, will no longer yield performance increases; exploiting memory and bandwidth capacities will instead be the right methodology. However, such a shift in compute-vs-data tradeoffs would not be exactly a return to the old vector days, since other physical factors such as latency will not change. As such, performance modeling that accounts for this fundamental architectural change in the post-Moore era will become important, as it could lead to disruptive changes in how computing systems, both hardware and software, evolve in the future.
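As a rough illustration of the compute-vs-data tradeoff described above, the sketch below uses a roofline-style bound, in which attainable performance is the smaller of the compute peak and memory bandwidth times arithmetic intensity. The roofline model is a widely used simplification chosen here for illustration, not a model taken from the talk; the function names, machine labels, and all numbers are hypothetical.

```python
def attainable_gflops(peak_gflops, mem_bw_gbs, intensity_flops_per_byte):
    """Roofline-style upper bound on performance in GFLOP/s.

    The bound is min(compute peak, memory bandwidth * arithmetic intensity).
    """
    return min(peak_gflops, mem_bw_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops, mem_bw_gbs):
    """Arithmetic intensity (FLOP/byte) above which a kernel is compute-bound."""
    return peak_gflops / mem_bw_gbs


# Hypothetical machines: one invests in FLOPS, the other in memory bandwidth.
machines = {
    "flops-centric": {"peak_gflops": 4000.0, "mem_bw_gbs": 200.0},
    "bandwidth-centric": {"peak_gflops": 1000.0, "mem_bw_gbs": 1000.0},
}

# Hypothetical low-intensity (data-bound) kernel, in FLOP per byte moved.
kernel_intensity = 0.25

results = {
    name: attainable_gflops(m["peak_gflops"], m["mem_bw_gbs"], kernel_intensity)
    for name, m in machines.items()
}

for name, gflops in results.items():
    ridge = ridge_point(machines[name]["peak_gflops"], machines[name]["mem_bw_gbs"])
    print(f"{name}: bound = {gflops} GFLOP/s, ridge point = {ridge} FLOP/byte")
```

Under these assumed numbers, the low-intensity kernel is bounded by bandwidth on both machines, so the bandwidth-centric design attains the higher bound despite its lower peak FLOPS. The sketch ignores latency, which the abstract notes will not improve, so it should be read only as a first-order bound.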
Satoshi Matsuoka has been a Full Professor at the Global Scientific Information and Computing Center (GSIC), a Japanese national supercomputing center hosted by the Tokyo Institute of Technology, since 2001. He received his Ph.D. from the University of Tokyo in 1993. He is the leader of the TSUBAME series of supercomputers, including TSUBAME2.0, which was the first supercomputer in Japan to exceed petaflop performance and became the 4th fastest in the world on the Top500 in Nov. 2010, as well as the recent TSUBAME-KFC, which became #1 in the world for power efficiency on both the Green 500 and Green Graph 500 lists in Nov. 2013. He is also currently leading several major supercomputing research projects, such as MEXT Green Supercomputing, JSPS Billion-Scale Supercomputer Resilience, and JST-CREST Extreme Big Data. He has written over 500 articles according to Google Scholar and has chaired numerous ACM/IEEE conferences, most recently serving as overall Technical Program Chair of the ACM/IEEE Supercomputing Conference (SC13) in 2013. He is a fellow of the ACM and European ISC, and has won many awards, including the JSPS Prize from the Japan Society for the Promotion of Science in 2006, awarded by His Highness Prince Akishino, the ACM Gordon Bell Prize in 2011, the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology in 2012, and recently the 2014 IEEE-CS Sidney Fernbach Memorial Award.
The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy-efficiency, and reliability significantly more costly with conventional techniques.
In this talk, we examine some promising research and design directions to overcome challenges posed by memory scaling. Specifically, we discuss three key solution directions: 1) enabling new memory architectures, functions, interfaces, and better integration of the memory and the rest of the system; 2) designing a memory system that intelligently employs multiple memory technologies and coordinates memory and storage management using non-volatile memory technologies; and 3) providing predictable performance and QoS to applications sharing the memory/storage system. If time permits, we might also briefly touch upon our ongoing related work on combating the scaling challenges of NAND flash memory.
An accompanying paper can be found here: http://users.ece.cmu.edu/~omutlu/pub/memory-systems-research_superfri14.pdf
Onur Mutlu is the Strecker Early Career Professor at Carnegie Mellon University. His broader research interests are in computer architecture and systems, especially in the interactions between languages, system software, compilers, and microarchitecture, with a major current focus on memory systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. Prior to Carnegie Mellon, he worked at Microsoft Research, Intel Corporation, and Advanced Micro Devices. He has received the IEEE Computer Society Young Computer Architect Award, the Intel Early Career Faculty Award, faculty partnership awards from various companies, including Facebook, Google, HP, Intel, IBM, Microsoft, and Samsung, a number of best paper recognitions at various computer systems venues, and a number of “computer architecture top pick” paper selections by IEEE Micro magazine. For more information, please see his webpage at http://www.ece.cmu.edu/~omutlu.