Dear all,<p>I work at a physical commodity operation where we move around 8-10 million metric tons of goods CFR. In this connection we are looking to digitize and automate many things in the company, which I will not bore you with. However, we are also looking to step up our SnD analysis by incorporating various types of big data analysis, e.g. satellite data. There are some SaaS providers that each offer about 25% of what we are looking for via API, so we need to work with multiple vendors. Therefore I am faced with a choice:<p>1. Either we buy from four different suppliers and solve our data needs quite fast, or…
2. We buy more raw data and invest in building the algorithms ourselves<p>There is a data science team of two people with a calendar that is 80-90% full already. Therefore I am trying to review the pros and cons of building ourselves:<p>Pros:
- Full flexibility
- I own the algorithms
- New knowledge created in the organization which could lead to more innovation
- No risk of vendor lock-in<p>Cons:
- Higher costs
- A lot of man hours spent
- Unknown time frame for implementation
- We need extra people to maintain the algorithms<p>I estimate that the absolute minimum cost of data for building will be around USD 100,000 (without extra labor costs), and for buying off the shelf USD 150,000.<p>I would be happy to hear if anyone here is struggling with the same hurdles, and to discuss how they are approaching it.
Build will always be more expensive, finish later, and require more maintenance than your worst-case scenario. If your competitive edge over the competition lies within the bounds of what you are building, then it may be worth it.<p>Building it is one thing. Maintaining, updating, running the development team, hiring/firing etc. come along as second-order consequences.<p>So I would approach the question from the point of whether this is the "secret sauce" that keeps the competitive edge. If not, then I would buy it from others.
You should go COTS for this and just buy. Is your job building software or integrating software into an existing business? From the sounds of it, it is the latter. So I would think it sensible to treat this purchase like buying a fleet of cars. You don't want to get into the tarpit of having to design the software and support it into the future; focus on the profit-making part. Pick a good set of products that allows you to integrate and extend.
Also, if you pick the right software, you'll have the ability to add on extensions when you're there. But the largest hurdle will be initial deployment and hitting that initial ROI, which may be bigger if you don't buy. But if it's in-house software, what's 50K over 20 years....<p>Think about maintenance cost over a longer horizon. How long will this software live for?
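To put rough numbers on the horizon question, here is a minimal sketch. It uses only the two figures from the original post (USD 100,000 for build data, USD 150,000 for buying) and the 20-year horizon mentioned above. Everything else is an assumption made up for illustration: both figures are treated as one-off costs, the yearly maintenance amounts are placeholders, and the `total_cost` helper is hypothetical. Swap in your own numbers before drawing any conclusion.

```python
def total_cost(upfront_usd: float, yearly_usd: float, years: int) -> float:
    """Upfront spend plus a flat yearly cost over the horizon (no discounting)."""
    if years <= 0:
        raise ValueError("years must be positive")
    return upfront_usd + yearly_usd * years


BUILD_DATA_USD = 100_000  # from the original post (excludes labor)
BUY_USD = 150_000         # from the original post
HORIZON_YEARS = 20        # horizon mentioned in the comment above

# Placeholders, NOT from the thread: replace with your own estimates.
ASSUMED_BUILD_MAINTENANCE_USD_PER_YEAR = 10_000
ASSUMED_BUY_MAINTENANCE_USD_PER_YEAR = 0

# The sticker-price gap, spread evenly over the horizon.
price_gap = BUY_USD - BUILD_DATA_USD
gap_per_year = price_gap / HORIZON_YEARS

build_total = total_cost(
    BUILD_DATA_USD, ASSUMED_BUILD_MAINTENANCE_USD_PER_YEAR, HORIZON_YEARS
)
buy_total = total_cost(
    BUY_USD, ASSUMED_BUY_MAINTENANCE_USD_PER_YEAR, HORIZON_YEARS
)

print(f"Price gap: USD {price_gap:,} ({gap_per_year:,.0f} per year)")
print(f"Build over {HORIZON_YEARS} years: USD {build_total:,}")
print(f"Buy over {HORIZON_YEARS} years: USD {buy_total:,}")
```

The point is only that the 50K gap shrinks to a small yearly amount over a long horizon, so whatever you assume for recurring maintenance and labor will likely dominate the comparison.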
The biggest con for building is the opportunity cost. The COTS solution will be up and running far sooner than your in-house built solution. With the data science team's workload you will probably need to hire or contract extra resources. Doing so pushes your completion date even further into the future.<p>If there are any gaps in the off-the-shelf solutions, then perhaps you could have the in-house team bridge them, and that will give you extra leverage down the track.
Consider the possibility that after integrating all the APIs of the off-the-shelf solutions, you might be confronted with bottlenecks or other limitations you were not aware of when you made the decision. Therefore the safer route might be to build it yourself, especially if you only need a small subset of the features offered by vendor solutions.
"Focus on what makes your beer taste better"<p>As /u/q-base said: '[...] I would approach the question from the point of whether this is the "secret sauce" that keeps the competitive edge. If not, then I would buy it from others.'