data lake
n. A massive amount of data stored and readily accessible in its pure, unprocessed state.
To prepare for this onslaught, some IT leaders are urging the creation of "data lakes." These are centralized repositories based on Hadoop that draw raw data from source systems and then pass them to downstream facilities for utilization by the knowledge workforce.
—Ron Bodkin, “Getting The Most From Your Data Lake,” Forbes, May 29, 2015
The data lake strategy is part of a greater movement toward data liberalization. It started with the printing press and moving the books out of the monastery. Sure, there was confusion and a schism, but did we really want to wait for the monks to decide who gets the handwritten books?
—Andrew C. Oliver, “Gartner gets the 'data lake' concept all wrong,” InfoWorld, July 31, 2014
The second most common use case is one we call "Data Exploration." In this case, organizations capture and store a large quantity of this new data (sometimes referred to as a data lake) in Hadoop and then explore that data directly.
—Shaun Connolly, “The three most common ways data junkies are using Hadoop,” Gigaom, December 15, 2013
2010 (earliest)
Based on the requirements above and the problems of the traditional solutions we have created a concept called the Data Lake to describe an optimal solution.

If you think of a datamart as a store of bottled water — cleansed and packaged and structured for easy consumption — the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.
—James Dixon, “Pentaho, Hadoop, and Data Lakes,” James Dixon's Blog, October 14, 2010