On-Site Workshop Attendees
- Anastasia Alexov, University of Amsterdam, LOFAR Data Formats Group Lead, LOFAR Beam-Formed Pipeline Group co-Lead, and Software Developer
The LOFAR project has challenges across all the categories being discussed at the Big Data workshop: multi-core processing, pipelines, data organization, data management and data visualization. My particular focus is on the LOFAR data formats. We have written the specifications for the various LOFAR data formats to be stored in HDF5; we have only recently started writing data to these specifications. We would like to find out about other projects' experiences in using HDF5. Ultimately we would like to form partnerships which will help guide these formats to grow into a true set of standards for radio data that can meet the demands of the next generation of radio observatories as well as the non-radio community such as LSST and other future missions.
- Thomas Bennett, South African SKA Project Office
I currently work in the MeerKAT Science Processing Team (SPT). The SPT deals with all activities related to processing and storing data products from MeerKAT and its precursor instruments. My current tasks in the SPT include: (1) data storage and retrieval and (2) investigations in high performance computing for data processing and storage. My interest in attending the workshop is to get an opportunity to learn from others and establish links with people who are solving similar problems.
- Bruce Berriman, California Institute of Technology
I am an astronomer and computer scientist at the Infrared Processing and Analysis Center (IPAC), Caltech, where I am the VAO Program Manager and the NASA Star and Exoplanet Database (NStED) Project Manager. My science interests are in the area of using multi-wavelength archival data to search for new brown dwarfs. My computing interests are in understanding how to apply emerging technologies to astronomy, and in developing approaches to sustainability in software.
- Francesca Boffi, Space Telescope Science Institute
The Information Technology division at Space Telescope Science Institute has been tasked by the Director's office to draw a strategic plan for science computing.
As we are not only the science operations center for Hubble, but also the host of NASA's optical/UV archive (with holdings of around 164TB and daily retrievals of around 120GB) and a vibrant research institution, we have identified as top priorities grid computing, cloud computing and GPUs. In particular, we plan to identify what the optimal (in terms of cost and performance) architecture is for connecting mass storage with processors in a primarily academic/research environment such as ours.
Given the topics of the workshop, I think I will benefit greatly from hearing what others have been doing in some of these areas and what other Institutions are doing to address similar science computing needs, and how they are drawing plans for the future.
I am the IT Project Scientist at the Space Telescope Science Institute. In this role, my main responsibility is to create an interface between the Science Staff and the IT division, and ensure that the science-driven IT needs of our science staff are well addressed and satisfied. I bridge the gap between these two worlds and communicate with both sides. I see myself as an enabler of science.
Prior to this, I was Branch Lead of the Research and Instrument Analysis Branch, a group of 30 people. I was responsible for their assignments both to the various instrument teams in the Instruments Division, and to working on research projects with our science staff. I hold a PhD in Astronomy, and my astronomical research interest is supernovae and their environments. I have been at the Institute since 1995, in a variety of roles, and I am grateful I have the opportunity to support science.
- Bryan Butler, NRAO
- Andrew Connolly, University of Washington
- Paul Coster, Swinburne University of Technology
I am a PhD student with a background in computer science. My PhD project involves searching for accelerated pulsars. This search will involve reprocessing the 320 TB dataset from the High Time Resolution Universe survey, adding an unknown orbital acceleration parameter. A GPU cluster will be used to perform this processing. We intend to select the most promising candidates using data mining techniques. My interest in the workshop is therefore to learn more about the latest methods of data processing and to make contact with people who are interested in working on similar problems.
- Bruce Elmegreen, IBM Watson Research Center
- Mike Folk, The HDF Group
- Brian Glendenning, NRAO
Brian Glendenning has been in charge of ALMA software from the beginning of that project. He is attending this meeting both as the person responsible for the software that will provide all ALMA data to its community, and to try to see into the future, since ALMA will inevitably increase its data rate very significantly in operations.
- David Halstead, NRAO
- Robert Hanisch, VAO, Space Telescope Science Institute
Dr. Robert J. Hanisch is a Senior Scientist at the Space Telescope Science Institute (STScI), Baltimore, Maryland, and is the Director of the US Virtual Astronomical Observatory, a program funded by the National Science Foundation and the National Aeronautics and Space Administration. In the past twenty years Dr. Hanisch has led many efforts in the astronomy community in the area of information systems and services, focusing particularly on efforts to improve the accessibility and interoperability of data archives and catalogs. He was the first chair of the International Virtual Observatory Alliance Executive Committee (2002-2003) and continues as a member of the IVOA Executive. From 2000 to 2002 he served as Chief Information Officer at STScI, overseeing all computing, networking, and information services for the Institute. Prior to that he had oversight responsibility for the Hubble Space Telescope Data Archive and led the effort to establish the Multimission Archive at Space Telescope (MAST) as the optical/UV archive center for NASA astrophysics missions. He has served as chair of the Program Organizing Committee for the Astronomical Data Analysis Software and Systems conferences, chair of the Space Science Data Systems Technical Working Group, chair of the Astrophysics Data Centers Coordinating Committee, chair of the Publications Board of the American Astronomical Society (AAS), and co-chair of the US Decadal Survey Study Group on Computation, Simulation, and Data Handling. He is currently chair of the International Astronomical Union Commission 5 Working Group on Virtual Observatories, Data Centers, and Networks and co-chair of the Working Group on Libraries, and chair of the AAS Working Group on Astronomical Software. He completed his Ph.D. in Astronomy in 1981 at the University of Maryland, College Park.
- Andrew Hart, Jet Propulsion Laboratory
- Gerd Heber, The HDF Group
- Gareth Hunt, NRAO
- Ercan Kamber, RAID
Dr. Ercan Kamber, CTO at RAID, Inc., is a leading expert on scalable storage solutions that work on Linux platforms and is a member of the OpenSFS technical working group. The path he took while doing Physics research provided him with extensive technical experience in scalable storage solutions, HPC and research computing. At RAID, Inc. he architected and deployed turnkey end-to-end solutions that solve customers' scalable storage and computational needs. He truly believes in a consultative approach and careful business analysis for architecting the right solution. He enjoys strategizing with customers to overcome their most challenging data, IT and engineering problems and turning cutting edge technologies into products and solutions. He does not shy away from getting his hands dirty and tinkers with new technologies constantly. "Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination." --Albert Einstein
- Brian Kent, NRAO
I am currently a Jansky Fellow at the NRAO in Charlottesville. I'm interested in studying the characteristics of nearby galaxies. Most of my astronomical data comes from surveys with radio telescopes, so I am interested in data visualization, wide field mosaicking, and Python. Swing by my website or poster for this meeting to learn more about my interests!
- Jeff Kern, NRAO
- Cameron Kiddle, University of Calgary, Technical Coordinator, CyberSKA
I am currently a Research Fellow for the Grid Research Centre and an Adjunct Assistant Professor in the Department of Computer Science at the University of Calgary. I am also Technical Coordinator for CyberSKA, a collaborative initiative among several North American institutions that is exploring the cyberinfrastructure required to address the large data needs of current and future radio telescopes, such as the Square Kilometre Array (SKA). I am interested in the workshop as it directly relates to the data and processing challenges that we are trying to address as part of the CyberSKA project.
- Amy Kimball, NRAO
I am a postdoc at the NRAO working in the NAASC (North American ALMA Science Center). A lot of my research involves analysis of survey data from multi-wavelength wide-field sky surveys, with a particular focus on population statistics of active radio galaxies. I'm attending this workshop partly to stay up-to-date on techniques that will improve my own research, and also to satisfy a general interest in learning more about applications of data mining and data management.
- Mark Lacy, NRAO
Mark Lacy leads the Data Services group at the North American ALMA Science Center (the NAASC). He is responsible for coordinating the installation of the ALMA archive mirror and the operation of a copy of the ALMA pipeline at the NAASC, and for the development of tools for analyzing ALMA datacubes within North America. ALMA will be among the first wave of petabyte-scale public astronomical archives, and the complex nature of ALMA data processing will present some unique challenges. The analysis and visualization of the large datacubes ALMA can produce also present significant difficulties that we have yet to tackle. We are thus particularly interested in talking with external groups interested in this problem.
- Joseph Lazio, Jet Propulsion Laboratory
Dr. Joseph Lazio's research interests are in studying fundamental physics via radio pulsars and 21-cm cosmology via observations at low radio frequencies. He serves currently as the Project Scientist for the Square Kilometre Array and has used a number of leading radio telescopes (including the GBT). Existing and emerging radio telescopes are already providing significant challenges in a variety of areas related to data processing and management. It is clear that new algorithms and techniques need to be developed and applied.
- Fred Lo, NRAO
- Chris Mattmann, Jet Propulsion Laboratory
Chris Mattmann has a wealth of experience in software design, and in the construction of large-scale data-intensive systems. His work has infected a broad set of communities, ranging from helping NASA unlock data from its next generation of earth science system satellites, to assisting graduate students at the University of Southern California (his alma mater) in the study of software architecture, all the way to helping industry and open source as a member of the Apache Software Foundation. When he's not busy being busy, he's spending time with his lovely wife and son braving the mean streets of Southern California.
My interest in the workshop is that I'm PI'ing an internally funded JPL task to investigate the use of our OODT technology (recently released as the first NASA project to the open source Apache Software Foundation) in building SKA data systems. We've had a lot of success using OODT over the last 10 years for planetary science, earth science missions, cancer research and in a number of other domains. We're really excited about the recent open source success and the potential that it opens up for collaborations. I'm also very excited as I meet more and more people from the radioscience community and understand their needs and challenging and interesting science problems!
- Reagan Moore, RENCI
My interests are in the application of policy-based data management to astronomy image collections for use within data sharing environments, digital libraries, data processing pipelines, and preservation environments. Examples include the DPOSS archive, the 2MASS mosaic, and the LSST data processing pipeline.
- Karen O'Neil, NRAO
- Vincent Oria, New Jersey Institute of Technology, Associate Professor of Computer Science
I am an Associate Professor of Computer Science and my research interests include multimedia databases, spatial databases and recommender systems. One of my research topics that I would like to apply to images of Astronomy is "Knowledge Propagation In Large Image Databases Using Neighborhood Information."
Propagating semantic information associated with some objects of interest to an entire image database has several applications, ranging from home photo album management to security. Existing solutions for adding semantic information to an image database are labor intensive and not always accurate. The aim of this work is to reduce to a minimum the level of human intervention in the semantic annotation process. Ideally, only one copy of each object of interest would be labeled through human intervention, and the labels would then be propagated to all other occurrences of the objects in the image database. To that end, we propose a neighborhood-based influence propagation approach called KProp, which builds a voting model and effectively propagates the knowledge associated with some objects to similar objects in the database. Each object iteratively collects opinions from neighbors, makes a decision on its status and provides this information to the others. We show that this procedure can be performed efficiently through matrix computations. KProp is applicable as long as pairwise similarities of objects are available and requires no human interaction besides the original labeling. We applied KProp to simple object and face classifications. The experimental results show that KProp achieves better results with fewer labeled examples per object.
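The voting idea described above can be illustrated with a generic label propagation sketch. This is not the KProp algorithm itself, whose voting model is not specified here. It is a minimal, textbook-style stand-in that assumes only a pairwise similarity matrix and a few human-labeled seed objects. The function name `propagate_labels`, the row normalization of similarities, and the clamping of seed labels are all assumptions made for illustration.

```python
from typing import Dict, List


def propagate_labels(similarity: List[List[float]],
                     seeds: Dict[int, int],
                     num_classes: int,
                     iterations: int = 50) -> List[int]:
    """Generic iterative label propagation (illustrative, not KProp itself)."""
    n = len(similarity)

    # Row-normalize similarities so each object's incoming votes sum to 1.
    weights = []
    for i, row in enumerate(similarity):
        cleaned = [0.0 if j == i else s for j, s in enumerate(row)]  # no self-votes
        total = sum(cleaned)
        weights.append([s / total if total > 0 else 0.0 for s in cleaned])

    # Score matrix: one row per object, one column per class.
    scores = [[0.0] * num_classes for _ in range(n)]
    for idx, label in seeds.items():
        scores[idx][label] = 1.0

    for _ in range(iterations):
        # One round of voting is the matrix product (weights x scores).
        new_scores = [
            [sum(weights[i][j] * scores[j][c] for j in range(n))
             for c in range(num_classes)]
            for i in range(n)
        ]
        # Clamp the human-labeled seeds so their labels never drift.
        for idx, label in seeds.items():
            new_scores[idx] = [1.0 if c == label else 0.0
                               for c in range(num_classes)]
        scores = new_scores

    # Each object decides on the class with the highest accumulated vote.
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Hypothetical toy data: objects 0 and 1 look alike, as do objects 2 and 3.
similarity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
seeds = {0: 0, 3: 1}  # only two objects are labeled by hand
print(propagate_labels(similarity, seeds, num_classes=2))  # [0, 0, 1, 1]
```

In this toy case the two unlabeled objects inherit the label of the seed they most resemble. With a numerical library, each voting round would reduce to a single matrix multiplication, which is the sense in which such a procedure can be carried out through matrix computations.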
- Rob Pennington, National Science Foundation
- Bob Picardi, RAID
- David Pugmire, Oak Ridge National Laboratory
- James Robnett, NRAO
- Sergiu Sanielvici, Pittsburgh Supercomputing Center
My main interest in attending the meeting follows from my roles as User Advocate for the Teragrid Technology Insertion Service and as leader of advanced support for novel and innovative projects in the next phase of the Teragrid project ("XD"). I need to understand the requirements of the projects that will be discussed at the meeting, and what parts of their work and data flow might be supported by the Teragrid.
- Nigel Sharp, National Science Foundation
- Amy Shelton, NRAO, Green Bank Software Development Division Head
As instrumentation continues to improve for our telescopes at the National Radio Astronomy Observatory, our data rates grow. At the Green Bank Telescope, we have a focal plane array development program as well as a program for developing the next generation of backends based upon FPGA technology. The National Radio Astronomy Observatory is looking for partnerships and collaborations to bring this next generation of instrumentation online for our user community. In particular, we are interested in intelligent automatic processing of data, near real-time data inspection and analysis, post-processing of massive data sets, and archiving of massive data sets.
- Alex Szalay, Johns Hopkins University
- Jill Tarter, Center for SETI Research
- Jim Ulvestad, National Science Foundation
- Michael Wise, Netherlands Institute for Radio Astronomy
Dr. Michael Wise is a staff astronomer at ASTRON, the Netherlands Institute for Radio Astronomy, and Project Scientist for the International LOFAR Telescope. LOFAR, the Low Frequency Array, is a next generation radio telescope under construction in the north of the Netherlands and across Europe, designed to operate in the largely unexplored low frequency range from 30-240 MHz. As one of the first of a new generation of radio instruments, the International LOFAR Telescope (ILT) provides a number of unique capabilities for the astronomical community. These include, among others, remote configuration and operation, data processing that is both distributed and parallel, buffered retrospective all-sky imaging, dynamic real-time system response, and the ability to provide multiple simultaneous streams of data to a community whose scientific interests run the gamut from lightning in the atmospheres of distant planets to the origins of the universe itself. LOFAR is one of many new facilities that are facing the large-scale data management issues exemplified by this workshop today, not down the road. We are interested in technical collaborations as well as participating in the discussion about how to address these issues at the community level.
- Cliff Woolley, NVIDIA