6/12/2023

Spark in the dark a cain

Most of the universe is dark, with dark matter and dark energy comprising more than 95 percent of its mass-energy. Yet we know little about dark matter and energy. To find answers, scientists run huge high-energy physics experiments. Analyzing the results demands high-performance computing, sometimes balanced with industrial trends.

After four years of running computing for the Large Hadron Collider CMS experiment at CERN near Geneva, Switzerland (part of the work that revealed the Higgs boson), Oliver Gutsche, a scientist at the Department of Energy’s (DOE) Fermi National Accelerator Laboratory, turned to the search for dark matter. “The Higgs boson had been predicted, and we knew approximately where to look,” he says. “With dark matter, we don’t know what we’re looking for.”

To learn about dark matter, Gutsche needs more data. Once that information is available, physicists must mine it. They are exploring computational tools for the job, including Apache Spark open-source software.

In searching for dark matter, physicists study results from colliding particles. “This is trivial to parallelize,” breaking the job into pieces to get answers faster, Gutsche explains. “Two PCs can each process a collision,” meaning researchers can employ a computer grid to analyze data.

Much of the work in high-energy physics, though, depends on software the scientists develop. “If our graduate students and postdocs only know our proprietary tools, then they’ll have trouble if they go to industry,” where such software is unavailable, Gutsche notes.

To search for dark matter, scientists collect and analyze results from colliding particles, an extremely computationally intense process. Spark is a data-reduction tool made for unstructured text files. That creates a challenge: accessing the high-energy physics data, which are in an object-oriented format. Fermilab computer science researchers Saba Sehrish and Jim Kowalkowski are tackling the task.

Spark offered promise from the beginning, with some particularly interesting features, Sehrish says. “One was in-memory, large-scale distributed processing” through high-level interfaces, which makes it easy to use. “You don’t want scientists to worry about how to distribute data and write parallel code,” she says. Spark takes care of that.

Another attractive feature: Spark is a supported research platform at the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science user facility at the DOE’s Lawrence Berkeley National Laboratory. “This gives us a support team that can tune it,” Kowalkowski says. Computer scientists like Sehrish and Kowalkowski can add capabilities, but making the underlying code work as efficiently as possible requires Spark specialists, some of whom work at NERSC.

Kowalkowski summarizes Spark’s desirable features as “automated scaling, automated parallelism and a reasonable programming model.” In short, he and Sehrish want to build a system allowing researchers to run an analysis that performs extremely well on large-scale machines without complications and through an easy user interface.

Just being easy to use, though, is not enough when dealing with data from high-energy physics. Spark appears to satisfy both ease-of-use and performance goals to some degree. Researchers are still investigating some aspects of its performance for high-energy physics applications, but computer scientists can’t have everything. “When you’re looking for more performance, you don’t get ease of use.”

‘You have petabytes of data in specific experimental formats that you have to turn into something useful for another platform.’

The Fermilab scientists selected Spark as an initial choice for exploring big-data science, and dark matter is just the first application under testing. “We need several real-use cases to understand the feasibility of using Spark for an analysis task,” Sehrish says. With scientists like Gutsche at Fermilab, dark matter was a good place to start. Sehrish and Kowalkowski want to simplify the lives of scientists running the analysis.