on Friday, July 9, 2010


Don't get me wrong: I'm thrilled to see some of the
ideas that I've talked about finally getting traction
in the wider community, and pleased that someone came up
with a different name than Megadata. But in many of the
discussions there appears to be way too much focus on the
"SQL" part of NoSQL. Breaking away from looking at data in
merely relational terms is a good start, but it misses the
larger picture: when you work with Megadata, really large
data sets, the nature of the problems and the nature of the
solutions change markedly.



At the simplest level, you have to get beyond N=1 thinking: you
can't possibly store all that data on one machine, nor can you
hope to process it on one machine. Whatever software you
build to house and handle all that data must be an
N > 1 endeavor.



But also missing is the idea that there are problems that are
simpler to solve, or are only possible to solve, once you have
enough data. That brings me back to one of the most exciting
announcements at Google I/O, which was the Prediction API; in
conjunction with Google Storage for Developers, it puts the
tools for leveraging Megadata into more hands.


So I'm glad to see NoSQL getting traction, but I'm looking forward
to the day when the discussion moves beyond what was left behind
and on to the new things we can build.