B2B companies typically work with several Big Data sources – marketing automation, sales platforms, billing systems, and customer service, to name a few. The sheer volume of data from these systems can be intimidating and taxing for an IT department. On top of that, business leaders keep asking for more new data sources (e.g., social media, natural language processing, operations performance) to be included in reporting and analysis.
This has fueled the growing popularity of the data lake. A data lake is basically one central location where data is placed from every system, platform, file, spreadsheet, audio file, and whatever else is deemed potentially valuable. The good news is that all the data is in one place for people to access. The bad news is that it’s up to you to figure out which data to use, how to prepare it, and how to analyze it.
Here are 5 things to know before deciding to swim in your company’s data lake.
Don’t dive in headfirst.
There’s a lot of data in that lake. Make sure you are familiar with the different sources of data before you start using them – know the background on each source, which files can be joined to others, and how big the data is. Most importantly, have a clearly defined plan for how you are going to use the data in the lake and what business questions you want answered.
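If you are comfortable with a little Python, one way to get that familiarity is a quick inventory of each source. The sketch below is only an illustration: the source names (billing, crm, support), the column names, and the sample rows are all made up, and it assumes each source has already been loaded as a list of records. It reports how many rows and which columns each source has, and which column names two sources have in common (a hint, not a guarantee, that they can be joined).

```python
from itertools import combinations


def inventory(sources):
    """Summarize each source: row count and the column names it contains."""
    summary = {}
    for name, rows in sources.items():
        columns = sorted({col for row in rows for col in row})
        summary[name] = {"rows": len(rows), "columns": columns}
    return summary


def shared_columns(sources):
    """For each pair of sources, list the column names they have in common."""
    summary = inventory(sources)
    shared = {}
    for a, b in combinations(sorted(summary), 2):
        common = sorted(set(summary[a]["columns"]) & set(summary[b]["columns"]))
        if common:
            shared[(a, b)] = common
    return shared


# Hypothetical sample data standing in for files pulled from the lake.
sources = {
    "billing": [
        {"account_id": 1, "invoice_total": 1200.0},
        {"account_id": 2, "invoice_total": 450.0},
    ],
    "crm": [
        {"account_id": 1, "owner": "Pat"},
        {"account_id": 3, "owner": "Lee"},
    ],
    "support": [
        {"ticket_id": 10, "account_id": 2, "status": "open"},
    ],
}

for name, info in inventory(sources).items():
    print(name, info["rows"], "rows, columns:", info["columns"])

for pair, columns in shared_columns(sources).items():
    print(pair, "share", columns)
```

A shared column name is only a starting point. Someone who knows the sources still has to confirm that the column means the same thing in both places.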
Use the buddy system.
If you are not an expert swimmer (in this case, a frequent user and analyzer of data), bring a buddy with you so you don’t get in over your head. Your buddy could be someone from IT, a Sales Ops reporting whiz, or someone from your analytics group. Bounce ideas off your buddy about the business questions you are trying to answer and how the data in the data lake can best be used.
Don’t go past the buoys.
Trying to do too much too soon may put you on a never-ending analysis path (and it will alienate your buddy). Start with some basic questions or challenges and use the data lake to address those. You will learn how to navigate the lake while also being viewed as a data-driven decision maker. Then move on to the more involved and challenging problems.
Know what’s in the water.
The data lake stores the data, but it doesn’t cleanse, transform, standardize, or join the data sources. It is up to you to learn which data you want to use. Have your buddy help you understand the data sources or direct you to the subject matter experts who know the details and pitfalls. You may find new data sources that you hadn’t considered, and you may find data files that are too incomplete or inconsistent to bother with.
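A rough check for "too incomplete" or "too inconsistent" can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not a full data-quality process: the rows and column names are invented, a value counts as missing only if it is absent, None, or an empty string, and "inconsistent" here just means text values that differ only by capitalization or stray spaces.

```python
def completeness(rows):
    """Return, for each column, the share of rows that have a usable value."""
    columns = sorted({col for row in rows for col in row})
    total = len(rows)
    report = {}
    for col in columns:
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        report[col] = filled / total if total else 0.0
    return report


def distinct_spellings(rows, column):
    """Find text values that differ only by case or surrounding whitespace."""
    groups = {}
    for row in rows:
        value = row.get(column)
        if isinstance(value, str) and value.strip():
            groups.setdefault(value.strip().lower(), set()).add(value)
    # Keep only the normalized values that appear in more than one raw form.
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}


# Hypothetical sample rows standing in for a file from the lake.
rows = [
    {"account_id": 1, "country": "USA", "industry": "Retail"},
    {"account_id": 2, "country": "usa ", "industry": ""},
    {"account_id": 3, "country": None, "industry": None},
    {"account_id": 4, "country": "Canada"},
]

print(completeness(rows))
print(distinct_spellings(rows, "country"))
```

Where to draw the line on "too incomplete" is a business call. A column that is mostly empty may still be useful for the handful of accounts that have it.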
Listen to the lifeguard.
Your IT director may not look the part, but he or she is the (figurative) person with the whistle telling you where you can swim and what the rules are. The data lake is not a complete free-for-all. IT still oversees data security rules, customer privacy considerations, storage space, and server access. Follow the rules and stay on the good side of the lifeguard.