Misconceptions of Corporate Data Science

“The internet is the greatest technical and social changing mechanism since the Gutenberg press.” I heard that somewhere, then again, and then so many times that I’m no longer sure of its origin. It has become a truism. It is on this foundation that data science became possible. The math and the approaches to neural network programming date back to the 1950s. However, only recently have computer processing power and access to data (largely via the internet) made them practical.

Machine learning is unlike the programming I did as a kid on the Apple IIe and the programming I did for years on the job. That type of programming required me to take business or game logic and code it into a set of rules. Machine Learning is fundamentally different. You code the inputs and features, define the desired type of output and let the network models decide how to arrive at the output. This is an oversimplification, but the purpose is to show the foundational difference between machine learning programming and previous programming styles. It is a style that deserves all the hype it is getting. Data Science, Machine Learning, Deep Learning, AI…whatever title you prefer…is simply fantastic.
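To make that difference concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the order amounts, the labels, and the function names are made up for illustration. It also swaps the neural network for the simplest learner I can think of, a single learned cutoff (sometimes called a decision stump), because the point is the contrast in style, not the model.

```python
# Illustrative only: the data, names, and thresholds below are hypothetical.

def rule_based_flag(amount):
    """Traditional style: the programmer hard-codes the business rule."""
    return amount > 100.0


def learn_threshold(examples):
    """Machine learning style: pick the cutoff that best fits labeled examples.

    `examples` is a list of (amount, label) pairs, where label is True or False.
    Needs at least two distinct amounts. Returns the candidate cutoff with the
    fewest misclassified examples.
    """
    amounts = sorted({amount for amount, _ in examples})
    # Candidate cutoffs: midpoints between neighboring distinct amounts.
    candidates = [(low + high) / 2 for low, high in zip(amounts, amounts[1:])]

    def errors(threshold):
        # Count examples where the prediction disagrees with the label.
        return sum((amount > threshold) != label for amount, label in examples)

    return min(candidates, key=errors)


# Hypothetical history: which past orders someone decided needed a review.
training_data = [
    (20.0, False), (35.0, False), (50.0, False),
    (80.0, True), (120.0, True), (200.0, True),
]

learned = learn_threshold(training_data)


def learned_flag(amount):
    """Apply the cutoff that was learned from the data, not written by hand."""
    return amount > learned
```

In the first function I decided the cutoff. In the second, the labeled history decided it, and the two can disagree: the hand-written rule lets an order of 80 through, while the learned one flags it because the examples say it should.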

Not to be repetitive, but there are countless blogs and articles about this: myriad resources that will all say what a wonderful thing Data Science is and how, if you are not using it now, you are missing out. Data Science…Yay! Waiting to hear that this is hoopla? Well, it’s not. What would be great, though, is for one of these articles to speak to the problems of Data Science in the real world and how those problems can be overcome. This is where I like to weed out my audience, especially friends who may be reading this and wondering what it is all about. If you work at an organization that is struggling to find a place for data science, I’m about to give you some real-world tips to help you along. The best thing about these tips is that they cost little to nothing.


Tip #1: Don’t think you don’t need Data Science.

“We don’t have a big data problem. We don’t have a need for Data Science.” This is near verbatim what I heard multiple times at mid-size companies. The problem with this thinking is that it assumes only companies like Google, Amazon and Netflix have a reason to use data science. It also assumes that these behemoths are the only ones who can use it. That’s not only an injustice to the craft itself, it slows down the progress and proliferation of the technology. You are only limited by the imagination of the management and directors. That is the biggest limitation of Machine Learning right now: a lack of imagination. If you can’t think of a use scenario for Data Science within your organization then it’s time to move on to Tip #2.


Tip #2: Grow your own data scientist.

Learning to become a good data scientist takes years, so have patience. It is likely you have someone within your organization today who would love the opportunity to learn this fascinating science. Identify that person, give them the time and resources to learn, and listen to them when they have ideas about employing data science. If you don’t think it is worthwhile to give someone time on the job to be creative and learn, I invite you to read the Post-It story. Online resources like Udacity are an inexpensive ($200 a month) way to develop a solid data scientist. Business analysts within your organization are extremely good candidates for data science, especially if they have programming experience. Reach out to local data science companies and resources for mentoring and guidance. They may not do this type of work explicitly, but most of them would be happy to work with you if you ask. It is an inexpensive way to build a data strategy.


Tip #3: Please Lord, do not hire a data science firm straight off to do everything for you.

This tip comes with a caveat. Hire a firm to do everything only if you have a very clearly defined idea of what you need to have done…which is rare. I’ve seen renowned Silicon Valley data science firms charge more than 600K to do a landscape assessment, the results of which were worthless. They cannot create miracles in your environment. Working for Blue Diesel Data Science, it is hard for me to say this, but I must…do not throw buckets of money at me or anyone else to do all the work for you. You must hold the reins. If you need guidance then yes, hire a firm to work with you in tandem. Look to us as mentors who will greatly speed the process along. Do not just hand it off. That is a mistake.



Tip #4: Do not build an expensive platform from the start.

Time and time again I’ve heard of companies trying to start a data science initiative with the platform. “Of course, we need Hadoop, Spark and a Data Science VM from Azure, and the machine learning platform in AWS and…etc…etc.” No…you don’t. Unlike the previous non-cloud world of infrastructure, when you need Hadoop, you can spin up Hadoop. When you need a data science VM, you can spin one up. In the beginning, you just need a good laptop. In contrast, I’ve seen infrastructure groups spin up Hadoop clusters costing more than 4K a month, only to let them run unattended for years. The adage “If you build it, they will come” does not apply. No…they won’t.


Tip #5: At the start, think tactically, not strategically.

You want your organization to have a data strategy, right? Then shouldn’t this be a strategic move? From a theoretical perspective yes, but reality dictates otherwise. If you need to prove or build a strategy in your organization from the ground up, you need to understand the ground first. Prove out some theories, convince some naysayers of the value, get some real work done first. I know a CIO who wanted a data strategy, but the rest of the organization did not understand and did not agree. There was an in-house grown data scientist who, in a year, developed the simplest of classification algorithms, one that saved the company tens of millions a year. Had his work been done just a couple of years earlier, it would have saved the company from a debacle that cost them a frightening amount. After that…everyone was on board. If the tide is against you, don’t fight it. Find a tactical problem, smash it, and then everyone will be behind you.


Tip #6: Just move.

I imagine an IT meeting that goes something like this:

Attendee 1: 
We need to foster a data-driven environment and embrace AI and Machine Learning. 

Attendee 2: 
What for? We don’t have a big-data problem. We don’t have anyone with experience. What for? 

Attendee 1: 
Nearly all the experts agree. AI and Data Science are going to be more transformative than the internet. We need to have a good understanding of their capabilities and how they can help our organization. We may not have a good idea now, or any idea, of how they will help us, but we’ll never know if we don’t start walking this path. 

Attendee 2: 
Ok, but it sounds expensive. 

Attendee 1: 
We need to identify an individual who would love to spearhead this endeavor. Someone who would enjoy learning data science and applying that knowledge to help us develop a data-driven strategy. That’s how we start. It will cost us only the time that individual needs to get started. 

Attendee 3: 
Can I throw my hat in the ring? I’ve been interested in this for a while. I’d love to take this on. 

Attendee 1: 
This is how we start. 


Tip #7: Attend conferences.

Now you are thinking: OK, this is supposed to be cheap, but now I have to fly around and attend conferences? You most definitely have local conferences, and they are likely free. In Minneapolis, we have MinneAnalytics. They host multiple conferences a year on everything from financial to sports analytics. Attending will give you an opportunity to see what others are doing, meet people and understand your local community of data scientists.

In the end, the biggest tip is to not be afraid of doing this, and to not assume it is beyond your company’s capabilities or needs. Like most things, you won’t know if you don’t try, and data science is such a wonderful and exciting field that you really have nothing to lose. This is a transformative time with a technology that is going to surpass the impact of the internet. To get started, ask a local data science firm to give you a free workshop on building a data strategy. Most will be happy to meet you and do this. Buy them lunch at least, though. Some of them are starving artists. 🙂
