We are building a B2B machine learning product and are running pilots with a customer (currently building models for their data on our servers).<p>We expect this customer will want to integrate our preprocessing and (rather complex) model generation process into their on-premise pipeline.<p>How do we protect our IP in this? We could of course go to some lengths to generate C++ executables or hide model definitions in proprietary formats, but that seems like the wrong focus right now.<p>We expect that for customers with no background in ML (beyond simple supervised examples), this will not be an issue, since the generated architectures mean nothing to them.<p>Others have dedicated data science teams that only lack expertise in our niche, and they could probably figure out how our process works if we simply deploy Python scripts generating TensorFlow models. We also suspect this depends on the sector: for some customers it is simply too far outside their expertise to bother trying, and paying us license and service fees is cheaper; for others, not so much.<p>How do we protect our know-how here? Does it make sense to invest the effort to "hide" models behind abstractions and formats, or should we simply focus on growth and ignore this?
Personally, I wouldn't bother. If you're dealing with reputable companies, just make sure the contract / license agreement clarifies what IP you own and what they're allowed to do with it. Most companies aren't going to blatantly violate a contract like that. And if you're dealing with less than reputable companies... stop doing that?
How long would it take to add a compile step to your build process?<p>A couple of days or a week? It seems like minimal effort, and it helps keep companies and their employees honest. You probably need to make sure you're covered on the contract side too.
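For a sense of scale, here is a minimal sketch of what such a compile step might look like for a pure-Python codebase, using only the standard library's `compileall`. The function name `build_bytecode_release` and the directory layout are hypothetical, not anything from the original poster's setup. Note that `.pyc` bytecode is fairly easy to decompile, so this is a speed bump rather than real protection; compiling to native code (for example with Cython) would be a stronger but more involved option.

```python
import compileall
import shutil
from pathlib import Path


def build_bytecode_release(src_dir, out_dir):
    """Copy src_dir to out_dir, compile it to .pyc, and drop the .py sources.

    Hypothetical helper for illustration; out_dir must not exist yet.
    """
    src, out = Path(src_dir), Path(out_dir)
    shutil.copytree(src, out)

    # legacy=True writes foo.pyc next to foo.py instead of into __pycache__,
    # so the module stays importable once the source file is removed.
    ok = compileall.compile_dir(str(out), legacy=True, quiet=1)
    if not ok:
        raise RuntimeError("bytecode compilation failed")

    # Remove the readable sources from the release tree.
    for py_file in list(out.rglob("*.py")):
        py_file.unlink()
    return out
```

The resulting tree has to be run with the same Python version that compiled it, since the bytecode format changes between releases.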