Comparative Evaluation of Data Lakes and Data Warehouses: When to Use Each of them with Case Studies Showing Practical Uses
Keywords:
Data Lake, Data Warehouse, Big Data, Real-Time Analytics
Abstract
Although they serve distinct functions, data lakes and data warehouses are both crucial to contemporary data management. Organizations that understand the differences can choose the one that best suits their requirements. Data lakes are flexible, affordable stores for huge volumes of both structured and unstructured data. When schema-on-read is needed to evaluate a variety of data formats, they are well suited to data science and exploratory analytics. Data warehouses, by contrast, are ideal for business intelligence (BI) and analytics that demand speed and dependability, since they store structured data and provide quick, precise querying capabilities. Case studies demonstrate how enterprises use these technologies effectively. For example, a financial institution used a data warehouse to streamline procedures and save time while improving regulatory compliance and optimizing reporting. Meanwhile, a technology corporation used a data lake to centralize raw data from several sources into a single repository for sophisticated analysis, thereby promoting ML innovation. These real-life examples highlight the advantages of each strategy and help corporations make the best decision. Depending on the use case, whether it calls for structured analytical performance or exploratory flexibility, data lakes and data warehouses each offer distinct advantages. This comparison provides useful insights to support decision-making by highlighting best practices and common pitfalls when assessing data lakes versus data warehouses. By defining the optimal times to use each strategy, businesses can create a data infrastructure that supports their operational and analytical objectives while ensuring flexibility, performance, and alignment with their particular requirements.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.