LINE’s Data Platform currently faces several challenges in data utilization. First, the sheer scale of LINE’s data makes manual metadata management difficult. In addition, one of our behavioral guidelines is “Always Data Driven”, so we need to provide a user experience that meets the diverse data-related needs of numerous businesses. Strict access permission management is an essential prerequisite for this.
To resolve these issues, we built an in-house data catalog. Metadata collection and management are automated using various Hadoop ecosystem technologies, for example by hooking queries into Apache Atlas to generate data lineage. For the user experience, the catalog is personalized and linked to various systems via APIs, which makes it easier to adopt within existing business processes.
This session will introduce the concept and the various functions designed when the data catalog was developed, and will look at several case studies of its use.