Wednesday, 6 May 2026

SUMMARY OF DATA COLLECTION AND REPOSITORIES

Data collection is the systematic, organised process of gathering information from different sources to support analysis and decision-making (Pavlov et al., 2022). It is the first phase of the data lifecycle: raw information is captured before it is processed, structured, and stored for further use. Collection may be direct, through interviews, questionnaires, observations, and focus group discussions, or indirect, drawing on existing records, institutional reports, databases, and published materials. Organisations increasingly use digital tools such as electronic systems and mobile applications to improve accuracy and speed up collection. Data collection is essential because it generates the evidence that supports informed decisions, strengthens planning, and improves monitoring and evaluation (Rafferty & Fuchs, 2021).

Data repositories are structured digital environments designed for the storage, organisation, and long-term preservation of data. They include relational databases, data warehouses, cloud-based storage platforms, digital libraries, and institutional information systems such as electronic health record platforms. Their primary role is to keep data secure, systematically organised, and readily accessible for analysis, reporting, or sharing. Repositories also help maintain data integrity, enable interoperability between systems, and support sustainable preservation practices (Kindling et al., 2020; Heidorn, 2021).

Data collection and data repositories are interdependent. Once data is gathered, it is validated, cleaned, and transformed before being stored in a repository for structured access and long-term use. The effectiveness of a repository depends largely on the quality of the data collected, while sound storage systems make that data more reliable and useful. Data collection therefore acts as the foundation for generating information, whereas repositories provide the infrastructure needed to turn raw data into usable knowledge (Fox et al., 2020).
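
The sequence described above (validation, cleaning, transformation, then storage) can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the cited sources: the sample records, the field names, the age range, and the `respondents` table are all hypothetical, and an in-memory SQLite database stands in for a real repository.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a survey tool.
RAW_RECORDS = [
    {"respondent_id": "r001", "age": "34", "region": " north "},
    {"respondent_id": "R002", "age": "", "region": "South"},      # missing age
    {"respondent_id": "R003", "age": "abc", "region": "East"},    # invalid age
    {"respondent_id": "R001", "age": "34", "region": "north"},    # duplicate ID
    {"respondent_id": "R004", "age": "51", "region": "south"},
]


def is_valid(record):
    """Validation: require an ID and a whole-number age between 0 and 120."""
    age = record["age"].strip()
    has_id = bool(record["respondent_id"].strip())
    return has_id and age.isdecimal() and 0 <= int(age) <= 120


def clean(record):
    """Cleaning: trim stray whitespace and normalise capitalisation."""
    return {
        "respondent_id": record["respondent_id"].strip().upper(),
        "age": record["age"].strip(),
        "region": record["region"].strip().title(),
    }


def transform(record):
    """Transformation: convert fields to the types the table expects."""
    return (record["respondent_id"], int(record["age"]), record["region"])


def load(rows, connection):
    """Storage: write rows to a relational table, ignoring duplicate IDs."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS respondents ("
        "respondent_id TEXT PRIMARY KEY, age INTEGER, region TEXT)"
    )
    connection.executemany(
        "INSERT OR IGNORE INTO respondents VALUES (?, ?, ?)", rows
    )
    connection.commit()


def run_pipeline(raw_records, connection):
    """Run all four stages and return what ended up in the repository."""
    rows = [transform(clean(r)) for r in raw_records if is_valid(r)]
    load(rows, connection)
    return connection.execute(
        "SELECT respondent_id, age, region FROM respondents "
        "ORDER BY respondent_id"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real repository
    print(run_pipeline(RAW_RECORDS, conn))
```

With the sample data, the two records with a missing or non-numeric age are rejected at validation and the repeated ID is ignored at storage, so only two rows reach the table. This reflects the point made above: the quality of what is stored depends on the checks applied at collection time.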

However, both processes face operational and technical challenges. These include poor-quality data with inconsistencies or missing values, inadequate technological infrastructure, limited integration between collection and storage platforms, and growing concerns about data privacy and security. Furthermore, a shortage of personnel skilled in data management can weaken governance structures and reduce overall system efficiency (Oliveira et al., 2021).
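
To make the data-quality challenge concrete, the sketch below counts missing values per field and lists identifiers that appear more than once. It is an assumption-laden illustration rather than a standard tool: the function name `quality_report`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def quality_report(records, required_fields, id_field="id"):
    """Count missing values per field and list IDs that occur more than once."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    id_counts = Counter(r[id_field] for r in records if r.get(id_field))
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    return {
        "total": len(records),
        "missing": dict(missing),
        "duplicate_ids": duplicates,
    }


# Hypothetical records illustrating the problems named above.
SAMPLE = [
    {"id": "A1", "visit_date": "2026-01-10", "weight_kg": 61.5},
    {"id": "A2", "visit_date": "", "weight_kg": 70.0},            # blank date
    {"id": "A2", "visit_date": "2026-01-12", "weight_kg": None},  # repeated ID
    {"id": "A3", "visit_date": "2026-01-13"},                     # field absent
]

if __name__ == "__main__":
    print(quality_report(SAMPLE, ["id", "visit_date", "weight_kg"]))
```

A report like this could be run before data is loaded into a repository, so that gaps and duplicates are caught at the collection stage rather than discovered during analysis.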

In conclusion, data collection and data repositories play a central role in modern information systems. Data collection generates relevant and reliable information, and repositories keep that information securely stored, well organised, and easily retrievable. Together, they enhance organisational decision-making, improve operational performance, and support effective research and service delivery.

REFERENCES

Fox, G. C., Hey, T., & Trefethen, A. E. (2020). The data lifecycle in scientific data management. Philosophical Transactions of the Royal Society A, 378(2166), 20190072.

Heidorn, P. B. (2021). The emerging role of data repositories in research ecosystems. Data Science Journal, 20(1), 1–12.

Kindling, M., et al. (2020). Data repositories and data sharing practices in research infrastructures. International Journal of Digital Curation, 15(1), 1–18.

Oliveira, M., et al. (2021). Challenges in data governance and management in digital systems. Information Systems Frontiers, 23(5), 1203–1218.

Pavlov, P., Kauffman, R. J., & Yeo, B. (2022). Big data and data collection strategies in organizations. MIS Quarterly Executive, 21(2), 89–104.

Rafferty, A. E., & Fuchs, C. (2021). Evidence-based data practices in modern organizations. Journal of Information Science, 47(6), 789–804.
