
  • Article (No Access)

    On the road to a scientific data lake for the High Luminosity LHC era

    The experiments at CERN’s Large Hadron Collider use the Worldwide LHC Computing Grid (WLCG) as their distributed computing infrastructure. Through distributed workload and data management systems, the WLCG provides thousands of physicists with seamless access to hundreds of grid-, HPC- and cloud-based computing and storage resources distributed worldwide. The LHC experiments annually process more than an exabyte of data using an average of 500,000 distributed CPU cores, enabling hundreds of new scientific results from the collider. However, over the past five years, as the volume of data from the LHC has grown, the resources available to the experiments have been insufficient to meet the needs of data processing, simulation and analysis. The problem will be even more severe for the next LHC phases: the High Luminosity LHC will be a multi-exabyte challenge, with envisaged storage and compute needs a factor of 10 to 100 above the expected technology evolution. The particle physics community needs to evolve its current computing and data organization models, changing the way it uses and manages the infrastructure, with a focus on optimizations that improve performance and efficiency without neglecting the simplification of operations. In this paper we highlight a recent R&D project on a scientific data lake and federated data storage.
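
    To make the data lake and federated storage ideas concrete, the sketch below shows, in Python, how a single logical namespace could federate several physical storage endpoints and pick a replica by network distance. This is a minimal illustration under assumed names; the classes, sites and URLs are hypothetical and are not part of the WLCG software stack.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    site: str          # storage endpoint holding this copy
    url: str           # physical URL of the copy
    latency_ms: float  # rough network distance from the client

@dataclass
class DataLake:
    """Toy federated namespace: one logical name, many physical replicas."""
    catalogue: dict = field(default_factory=dict)  # logical name -> [Replica]

    def add_replica(self, lfn: str, replica: Replica) -> None:
        self.catalogue.setdefault(lfn, []).append(replica)

    def nearest_replica(self, lfn: str) -> Replica:
        # Choose the lowest-latency copy of the logical file.
        return min(self.catalogue[lfn], key=lambda r: r.latency_ms)

lake = DataLake()
lake.add_replica("run42/events.root",
                 Replica("CERN", "root://eos.cern.example/run42/events.root", 5.0))
lake.add_replica("run42/events.root",
                 Replica("NL-T1", "root://dcache.nl.example/run42/events.root", 22.0))

best = lake.nearest_replica("run42/events.root")
print(f"reading from {best.site}: {best.url}")
```

    A real federation would weigh storage quality of service, load and cost alongside latency, but the core idea is the same: users address logical data, and the lake decides which physical copy serves the request.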

  • Article (No Access)

    SPOTLIGHTS

      Kantar Health in Healthcare.

      Saphetor: Leader of the Revolution in Clinical Diagnostics.

      Healthcare Advancement: “Healthcare is a National Right, Not a Privilege”.

      Data Showed Treatment Makes mCRPC Patients Live Longer.

  • Article (No Access)

    Megale: A Metadata-Driven Graph-Based System for Data Lake Exploration

    Data lakes are storage repositories that hold large amounts of data (big data) in its native format, whether structured, semi-structured or unstructured. Data lakes support a wide range of use cases, such as carrying out advanced analytics and extracting knowledge patterns. However, simply dumping data into a data lake only leads to a data swamp. To prevent such a situation, enterprises can adopt best practices, among which is managing data lake metadata. A growing body of research has focused on proposing metadata systems and models for data lakes, with particular interest in model genericity. However, existing models fail to cover all aspects of a data lake because of their static modeling approach. Moreover, they do not fully cover features essential to effective metadata management, namely governance, visibility and uniform treatment of data lake concepts. In this paper, we propose a dynamic modeling approach that meets these requirements, based on two main constructs: the data lake concept and the data lake relationship. We showcase our approach with Megale, a graph-based metadata system for NoSQL data lake exploration. We present a proof-of-concept implementation of Megale and show its effectiveness and efficiency in exploring the data lake.
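
    The abstract names two modeling constructs but does not detail their schema. The sketch below is a minimal, hypothetical Python rendering of a graph-based metadata store built from typed concepts and typed relationships, where exploration is a filtered walk over the graph; the class, attribute and example names are illustrative assumptions, not Megale's actual model or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataLakeConcept:
    # A node of the metadata graph: any data lake entity (dataset,
    # user, storage zone, ...), typed at run time rather than through
    # a fixed, static schema.
    name: str
    concept_type: str                 # e.g. "dataset", "user"

@dataclass(frozen=True)
class DataLakeRelationship:
    # A typed edge between two concepts, e.g. lineage or ownership.
    source: DataLakeConcept
    target: DataLakeConcept
    rel_type: str                     # e.g. "derived_from", "owned_by"

class MetadataGraph:
    """Toy metadata store: exploration is a filtered neighbourhood walk."""
    def __init__(self) -> None:
        self.edges: list[DataLakeRelationship] = []

    def relate(self, source: DataLakeConcept, rel_type: str,
               target: DataLakeConcept) -> None:
        self.edges.append(DataLakeRelationship(source, target, rel_type))

    def neighbours(self, concept: DataLakeConcept, rel_type: str):
        # All concepts reachable from `concept` over edges of `rel_type`.
        return [e.target for e in self.edges
                if e.source == concept and e.rel_type == rel_type]

raw = DataLakeConcept("tweets_2023.json", "dataset")
curated = DataLakeConcept("tweets_2023_clean.parquet", "dataset")
alice = DataLakeConcept("alice", "user")

g = MetadataGraph()
g.relate(curated, "derived_from", raw)
g.relate(alice, "owns", curated)

print([c.name for c in g.neighbours(curated, "derived_from")])
# -> ['tweets_2023.json']
```

    Because concepts carry a run-time type rather than a fixed schema, new kinds of entities and relationships can be added without changing the model, which is the kind of flexibility a dynamic modeling approach aims for.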