Ferd Scheepers, ING Group | IBM Think 2018
Ferd Scheepers, Chief Information Architect at ING Group, sits down with Dave Vellante and Lisa Martin on day one of IBM Think 2018 at Mandalay Bay Resort and Casino in Las Vegas, Nevada. #Think18 #theCUBE

https://siliconangle.com/2018/04/02/slay-big-data-swamp-thing-governance-protips-think2018-guestoftheweek/

Slay the big data ‘Swamp Thing’ with these governance protips

Now that many companies find themselves with expansive data lakes in this era of big data, what should they do to keep these information reservoirs from coagulating into sticky swamps? Scratch that: what if the ship has sailed, and they’re already up a messy, confusing data creek without a paddle? Without further ado (and without further belaboring the metaphor), here are protips from ING Group on how to govern data lakes for compliance and analytics.

The Dutch multinational banking and financial services corporation, headquartered in Amsterdam, began building out its data lake and governance strategy about six years ago. It selected IBM Corp. to godfather the effort; IBM supplied the data aggregation and labeling technologies. ING did not rake in gains from the project overnight; it took several years, and the company still has holes to patch, according to Ferd Scheepers, chief information architect at ING.

“If you believe you can do this journey and have value after a year and then you’re done, it doesn’t work that way,” Scheepers said.

That is not to say it isn’t worth the effort: ING has improved the efficiency of data governance and analytics across all departments. (At any rate, businesses can’t afford to slack off with the General Data Protection Regulation set to descend on them in May.) The recipe calls for a top-down executive decision and a clean and sober selection of appropriate technologies, according to Scheepers.

Scheepers spoke with Dave Vellante (@dvellante) and Lisa Martin (@LuccaZara), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, at the IBM Think event in Las Vegas. They discussed how to govern big data and turn compliance fright into innovative might. (* Disclosure below.)

This week, theCUBE spotlights Ferd Scheepers as our Guest of the Week.

The labeling game gets all on the same page

Getting everyone within the ING empire on board with the data governance architecture required tailoring the pitch to different regions and departments.

“Selling the architecture actually means that you need to go to the different stakeholders with very different stories. So what’s in it for them?” Scheepers said.

For example, chief information officers gain a more navigable landscape, with automation replacing a lot of manual drudgery; the increased control means all their risk items go down, he explained. The business side gets well-articulated context around data, and it actually gets to own the data and decide who gets access to it and what they can do with it.

A crucial step in governing data for use across an organization is getting everyone on the same page semantically. In other words, the business needs to bring all data sources together and qualify them with business terms so that people can understand what they are. That sounds simple enough, but the reality for large, branched-out corporations like ING is more complicated.

Infusing a common language across all lines of business and across all countries was tricky, Scheepers pointed out. Even a simple term like “customer” can be subject to different interpretations.

“I mean that sounds very natural for a bank to understand what a customer is,” he said. “But you might have very different definitions based on where you come from and which country.”

ING is increasingly experimenting with data discovery tools that automatically classify data and tie it back to business terms.
It still relies heavily on manual labeling, however, due to the massive quantity and diversity of data at ING.

“As a bank, you probably have thousands of things that you could describe on a business term level,” Scheepers stated.

A hierarchy of priority for depth of description is a sensible way to keep from biting off a mouthful of data that governance can’t chew.

“When you talk about customer data, you want to know all the different details about … what is a salary? Does an account include accrued interest?” Scheepers said. On the other hand, log data, for instance, can slide with a less detailed description, he added.

...

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the IBM Think event.

(* Disclosure: TheCUBE is a paid media partner for IBM Think. Neither IBM, the event sponsor, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)