Data Integration Techniques: Exploring strategies and tools for coordinating data from many sources and systems

Authors

  • Muneer Ahmed Salamkar, Senior Associate, JP Morgan Chase, USA
  • Karthik Allam, Big Data Infrastructure Engineer, JP Morgan Chase, USA

Keywords:

Data integration, ETL, ELT, data quality

Abstract

Modern organizations need data integration because it allows them to combine data from many sources for improved analysis and decision-making. As businesses become more data-driven, integrating data from platforms such as databases and applications is essential for overcoming obstacles like silos, inconsistencies, and disparate formats. ELT (extract, load, transform) is an evolution of traditional ETL (extract, transform, load) techniques that manages larger data volumes more effectively. Technologies such as data lakes, data warehouses, and cloud-based platforms are changing how integrated data is stored and accessed. Event-driven architectures, API-based integration, and data virtualization are examples of real-time solutions that provide smooth access and consistency across systems. Additionally, automation and machine learning are reducing manual effort, improving data quality, and simplifying integration procedures. To help firms select techniques that meet their particular requirements, this discussion examines important integration strategies and the newest technologies. By using these techniques, businesses can improve performance, make better decisions, and handle data in a scalable and economical manner.

Published

08-06-2020

How to Cite

[1]
Muneer Ahmed Salamkar and Karthik Allam, “Data Integration Techniques: Exploring strategies and tools for coordinating data from many sources and systems”, Distrib. Learn. Broad Appl. Sci. Res., vol. 6, pp. 628–651, Jun. 2020, Accessed: Mar. 14, 2025. [Online]. Available: https://dlbasr.org/index.php/publication/article/view/41