this is a really good question and I think lies close to the heart of why so many data projects fail. If we can't define what a data product is, then what's our mission anyway?
Zhamak's answer is a bit open-ended but it's not wrong. Thinking through the difference between a data product and a typical application, two big things jump out to me:
1) Applications are much more deterministic. They are built to solve a specific need. Data products are much more opportunistic. They try to address emerging needs.
2) Applications are largely standalone and provide a guided experience. You're supposed to use them in a specific way. Data products are incomplete. Or rather they are incomplete without the user. The user is PART of the real product.
Totally. That’s what makes it hard though. If a data product isn’t well bounded / complete, then you can’t build a standard deployment system around it, while if it is bounded / complete, we’ll just end up with a system we already have, right? (e.g., a database or a data portal)
Of course, treating the data (tables, streams, etc.) as a first-class consideration is great and valuable. But the challenge then becomes proliferation and accountability more so than deployment.
Totally. If they want to help deploy / run things, they'll build an orchestrator. If they want to help index the most important things, they'll build a catalog. If they want to federate queries, they'll build Presto. If they want to do all of these things, they'll build a cloud data platform?
heh. btw have you seen the articles on the comedian john mulaney’s takedown of salesforce (and tech) at their own conference? I liked “some of the vaguest language has been used here in the last three days. The fact that there are 45,000 trailblazers here couldn’t devalue the title anymore”
Ok. Where does docker come in? 🤔
i still don’t understand what nextdata is doing. Given its presumably proprietary nature I can’t find out for myself either.
Incredible. Wish I could see the full set