data collection - OpenMD.com Journal Search

Mechanical Turk data collection in addiction research: utility, concerns and best practices.

Addiction (Abingdon, England) Oct 2020

Authors: Alexandra M Mellis, Warren K Bickel

AIMS

Amazon Mechanical Turk (MTurk) provides a crowdsourcing platform for the engagement of potential research participants with data collection instruments. This review (1) provides an introduction to the mechanics and validity of MTurk research; (2) gives examples of MTurk research; and (3) discusses current limitations and best practices in MTurk research.

METHODS

We review four use cases of MTurk for research relevant to addictions: (1) the development of novel measures, (2) testing interventions, (3) the collection of longitudinal use data to determine the feasibility of longer-term studies of substance use and (4) the completion of large batteries of assessments to characterize the relationships between measured constructs. We review concerns with the platform, ways of mitigating these and important information to include when presenting findings.

RESULTS

MTurk has proved to be a useful source of data for behavioral science more broadly, with specific applications to addiction science. However, it is still not appropriate for all use cases, such as population-level inference. To live up to the potential of highly transparent, reproducible science from MTurk, researchers should clearly report inclusion/exclusion criteria, data quality checks and reasons for excluding collected data, how and when data were collected and both targeted and actual participant compensation.

CONCLUSIONS

Although on-line survey research is not a substitute for random sampling or clinical recruitment, the Mechanical Turk community of both participants and researchers has developed multiple tools to promote data quality, fairness and rigor. Overall, Mechanical Turk has provided a useful source of convenience samples despite its limitations and has demonstrated utility in the engagement of relevant groups for addiction science.

Topics: Behavior, Addictive; Behavioral Research; Crowdsourcing; Data Accuracy; Data Collection; Humans; Patient Selection

PubMed: 32135574
DOI: 10.1111/add.15032

Use of Wearable, Mobile, and Sensor Technology in Cancer Clinical Trials.

JCO Clinical Cancer Informatics Dec 2018

Review

Authors: Suzanne M Cox, Ashley Lane, Samuel L Volchenboum...

As the availability and sophistication of mobile health (mHealth) technology (wearables, mobile technology, and sensors) continues to increase, there is great promise that these tools will be transformative for clinical trials and drug development. This review provides an overview of the current landscape of potential measurement options, including the various types of data collected, methods/tools for collecting them, and a crosswalk of available options. The opportunities and potential drawbacks of mHealth in cancer clinical trials are discussed. Specific concerns related to data accuracy, provenance, and regulatory issues are highlighted, with suggestions for how to address these in future research. Next steps for establishing mHealth methods and tools as legitimate and accepted measures in oncology clinical trials include continuation of regulatory definition by the FDA; establishment of security standards and protocols; refinement and implementation of methods to establish and document data accuracy; and finally, creation of feedback loops wherein regulators receive updates from researchers with better and more timely data, which should decrease trial times and lessen drug development costs. Implementing mHealth technologies into cancer clinical trials has the potential to transform and propel oncology drug development and precision medicine to keep pace with the rapidly increasing developments in genomics and immunology.

Topics: Biomedical Technology; Clinical Trials as Topic; Data Collection; Humans; Neoplasms; Remote Sensing Technology; Smartphone; Telemedicine; Wearable Electronic Devices

PubMed: 30652590
DOI: 10.1200/CCI.17.00147

A patient and family data domain collection framework for identifying disparities in pediatrics: results from the pediatric health equity collaborative.

BMC Pediatrics Jan 2018

Authors: Aswita Tan-McGrory, Caroline Bennett-AbuAyyash, Stephanie Gee...

BACKGROUND

By 2020, racial and ethnic minorities are projected to make up the majority of the child population, and health care organizations will need a system in place that collects accurate and reliable demographic data in order to monitor disparities. The goals of this group were to establish sample practices, approaches and lessons learned with regard to race, ethnicity, language, and other demographic data collection in pediatric care settings.

METHODS

A panel of 16 research and clinical professional experts working in 10 pediatric care delivery systems in the US and Canada convened twice in person for 3-day consensus development meetings and met multiple times via conference calls over a two-year period. Current evidence on adult demographic data collection was systematically reviewed and unique aspects of data collection in the pediatric setting were outlined. Human-centered design methods were used to facilitate theme development, encourage constructive and innovative discussion, and generate consensus.

RESULTS

Group consensus determined six final data collection domains: 1) caregivers, 2) race and ethnicity, 3) language, 4) sexual orientation and gender identity, 5) disability, and 6) social determinants of health. For each domain, the group defined the domain, established a rationale for collection, identified the unique challenges for data collection in a pediatric setting, and developed sample practices, based on the experience of the members, as a starting point that each health care organization can customize. Several unique challenges in the pediatric setting across all domains include: data collection on caregivers, determining an age at which it is appropriate to collect data from the patient, collecting and updating data at multiple points across the lifespan, the limits of the electronic health record, and determining the purpose of the data collection before implementation.

CONCLUSIONS

There is no single approach that will work for all organizations when collecting race, ethnicity, language and other social determinants of health data. Each organization will need to tailor its data collection based on the population it serves, the financial resources available, and the capacity of the electronic health record.

Topics: Canada; Data Collection; Disability Evaluation; Electronic Health Records; Ethnicity; Gender Identity; Health Equity; Healthcare Disparities; Humans; Language; Minority Groups; Pediatrics; Racial Groups; Sexual Behavior; Social Determinants of Health; United States

PubMed: 29385988
DOI: 10.1186/s12887-018-0993-2

An informatics framework for the standardized collection and analysis of medication data in networked research.

Journal of Biomedical Informatics Dec 2014

Review

Authors: Rachel L Richesson

Medication exposure is an important variable in virtually all clinical research, yet there is great variation in how the data are collected, coded, and analyzed. Coding and classification systems for medication data are heterogeneous in structure, and there is little guidance for implementing them, especially in large research networks and multi-site trials. Current practices for handling medication data in clinical trials have emerged from the requirements and limitations of paper-based data collection, but there are now many electronic tools to enable the collection and analysis of medication data. This paper reviews approaches to coding medication data in multi-site research contexts, and proposes a framework for the classification, reporting, and analysis of medication data. The framework can be used to develop tools for classifying medications in coded data sets to support context appropriate, explicit, and reproducible data analyses by researchers and secondary users in virtually all clinical research domains.

Topics: Biomedical Research; Clinical Coding; Clinical Trials as Topic; Computational Biology; Data Collection; Pharmaceutical Preparations; Terminology as Topic

PubMed: 24434192
DOI: 10.1016/j.jbi.2014.01.002

National health accounts data from 1996 to 2010: a systematic review.

Bulletin of the World Health... Aug 2015

Review

Authors: Anthony L Bui, Rouselle F Lavado, Elizabeth K Johnson...

OBJECTIVE

To collect, compile and evaluate publicly available national health accounts (NHA) reports produced worldwide between 1996 and 2010.

METHODS

We downloaded country-generated NHA reports from the World Health Organization global health expenditure database and the Organisation for Economic Co-operation and Development (OECD) StatExtract website. We also obtained reports from Abt Associates, through contacts in individual countries and through an online search. We compiled data in the four main types used in these reports: (i) financing source; (ii) financing agent; (iii) health function; and (iv) health provider. We combined and adjusted data to conform with the first edition of the OECD manual A system of health accounts (2000).

FINDINGS

We identified 872 NHA reports from 117 countries containing a total of 2936 matrices for the four data types. Most countries did not provide complete health expenditure data: only 252 of the 872 reports contained data in all four types. Thirty-eight countries reported an average not-specified-by-kind value greater than 20% for all data types and years. Some countries reported substantial year-on-year changes in both the level and composition of health expenditure that were probably produced by data-generation processes. All study data are publicly available at http://vizhub.healthdata.org/nha/.
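The counts reported above can be turned into proportions with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the variable names are illustrative, and the only inputs are the figures quoted in the findings (872 reports, 252 complete reports, 2936 matrices).

```python
# Figures quoted in the FINDINGS paragraph above.
TOTAL_REPORTS = 872      # NHA reports identified
COMPLETE_REPORTS = 252   # reports with data in all four types
TOTAL_MATRICES = 2936    # matrices across the four data types

# Share of reports that contain all four data types, as a percentage.
complete_share_pct = round(COMPLETE_REPORTS / TOTAL_REPORTS * 100, 1)

# Average number of matrices per report.
matrices_per_report = round(TOTAL_MATRICES / TOTAL_REPORTS, 2)

print(f"Complete reports: {complete_share_pct}% of {TOTAL_REPORTS}")
print(f"Average matrices per report: {matrices_per_report}")
```

Under these figures, roughly 29% of reports were complete, which is consistent with the statement that most countries did not provide complete health expenditure data.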

CONCLUSION

Data from NHA reports on health expenditure are often incomplete and, in some cases, of questionable quality. Better data would help finance ministries allocate resources to health systems, assist health ministries in allocating capital within the health sector and enable researchers to make accurate comparisons between health systems.

Topics: Data Collection; Data Interpretation, Statistical; Databases, Factual; Global Health; Health Expenditures; Humans; World Health Organization

PubMed: 26478614
DOI: 10.2471/BLT.14.145235

Structured methodology review identified seven (RETREAT) criteria for selecting qualitative evidence synthesis approaches.

Journal of Clinical Epidemiology Jul 2018

Comparative Study Review

Authors: Andrew Booth, Jane Noyes, Kate Flemming...

OBJECTIVE

To compare and contrast different methods of qualitative evidence synthesis (QES) against criteria identified from the literature and to map their attributes to inform selection of the most appropriate QES method to answer research questions addressed by qualitative research.

STUDY DESIGN AND SETTING

Electronic databases, citation searching, and a study register were used to identify studies reporting QES methods. Attributes compiled from 26 methodological papers (2001-2014) were used as a framework for data extraction. Data were extracted into summary tables by one reviewer and then considered within the author team.

RESULTS

We identified seven considerations determining choice of methods from the methodological literature, encapsulated within the mnemonic Review question-Epistemology-Time/Timescale-Resources-Expertise-Audience and purpose-Type of data. We mapped 15 different published QES methods against these seven criteria. The final framework focuses on stand-alone QES methods but may also hold potential when integrating quantitative and qualitative data.

CONCLUSION

These findings offer a contemporary perspective as a conceptual basis for future empirical investigation of the advantages and disadvantages of different methods of QES. It is hoped that this will inform appropriate selection of QES approaches.

Topics: Data Collection; Evidence-Based Medicine; Qualitative Research; Systematic Reviews as Topic

PubMed: 29548841
DOI: 10.1016/j.jclinepi.2018.03.003

Electronic data collection in a multi-site population-based survey: EN-INDEPTH study.

Population Health Metrics Feb 2021

Authors: Sanne M Thysen, Charlotte Tawiah, Hannah Blencowe...

BACKGROUND

Electronic data collection is increasingly used for household surveys, but factors influencing design and implementation have not been widely studied. The Every Newborn-INDEPTH (EN-INDEPTH) study was a multi-site survey using electronic data collection in five INDEPTH health and demographic surveillance system sites.

METHODS

We described experiences and learning involved in the design and implementation of the EN-INDEPTH survey, and undertook six focus group discussions with field and research teams to explore their experiences. Thematic analyses were conducted in NVivo 12 using an iterative process guided by a priori themes.

RESULTS

Five steps of the process of selecting, adapting and implementing electronic data collection in the EN-INDEPTH study are described. Firstly, we reviewed possible electronic data collection platforms, and selected the World Bank's Survey Solutions® as the most suited for the EN-INDEPTH study. Secondly, the survey questionnaire was coded and translated into local languages, and further context-specific adaptations were made. Thirdly, data collectors were selected and trained using a standardised manual. Training varied between 4.5 and 10 days. Fourthly, instruments were piloted in the field and the questionnaires finalised. During data collection, data collectors appreciated the built-in skip patterns and error messages. Internet connection unreliability was a challenge, especially for data synchronisation. For the fifth and final step, data management and analyses, it was considered that data quality was higher and less time was spent on data cleaning. The possibility to use paradata to analyse survey timing and corrections was valued. Synchronisation and data transfer should be given special consideration.

CONCLUSION

We synthesised experiences using electronic data collection in a multi-site household survey, including perceived advantages and challenges. Our recommendations for others considering electronic data collection include ensuring adaptations of tools to local context, piloting/refining the questionnaire in one site first, buying power banks to mitigate power interruptions and paying attention to issues such as GPS tracking and synchronisation, particularly in settings with poor internet connectivity.

Topics: Data Accuracy; Electronics; Humans; Infant, Newborn; Surveys and Questionnaires

PubMed: 33557855
DOI: 10.1186/s12963-020-00226-z

Data and Model Biases in Social Media Analyses: A Case Study of COVID-19 Tweets.

AMIA ... Annual Symposium Proceedings.... 2021

Authors: Yunpeng Zhao, Pengfei Yin, Yongqiu Li...

During the coronavirus disease pandemic (COVID-19), social media platforms such as Twitter have become a venue for individuals, health professionals, and government agencies to share COVID-19 information. Twitter has been a popular source of data for researchers, especially for public health studies. However, the use of Twitter data for research also has drawbacks and barriers. Biases appear everywhere from data collection methods to modeling approaches, and those biases have not been systematically assessed. In this study, we examined six different data collection methods and three different machine learning (ML) models (commonly used in social media analysis) to assess data collection bias and measure ML models' sensitivity to data collection bias. We showed that (1) publicly available Twitter data collection endpoints with appropriate strategies can collect data that is reasonably representative of the Twitter universe; and (2) careful examinations of ML models' sensitivity to data collection bias are critical.

Topics: Bias; COVID-19; Data Collection; Humans; Machine Learning; Social Media

PubMed: 35308985
DOI: No ID Found

Crowdsourcing Samples in Cognitive Science.

Trends in Cognitive Sciences Oct 2017

Review

Authors: Neil Stewart, Jesse Chandler, Gabriele Paolacci...

Crowdsourcing data collection from research participants recruited from online labor markets is now common in cognitive science. We review who is in the crowd and who can be reached by the average laboratory. We discuss reproducibility and review some recent methodological innovations for online experiments. We consider the design of research studies and arising ethical issues. We review how to code experiments for the web, what is known about video and audio presentation, and the measurement of reaction times. We close with comments about the high levels of experience of many participants and an emerging tragedy of the commons.

Topics: Cognitive Science; Crowdsourcing; Data Collection; Humans; Reproducibility of Results

PubMed: 28803699
DOI: 10.1016/j.tics.2017.06.007

Use of public datasets in the examination of multimorbidity: Opportunities and challenges.

Mechanisms of Ageing and Development Sep 2020

Review

Authors: Christopher Boulton, J Mark Wilkinson

The interrogation of established, large-scale datasets presents great opportunities in health data science for the linkage and mining of potentially disparate resources to create new knowledge in a fast and cost-efficient manner. The number of datasets that can be queried in the field of multimorbidity is vast, ranging from national administrative and audit datasets, large clinical, technical and biological cohorts, through to more bespoke data collections made available by individual organisations and laboratories. However, with these opportunities also come technical and regulatory challenges that require an informed approach. In this review, we outline the potential benefits of using previously collected data as a vehicle for research activity. We illustrate the added value of combining potentially disparate datasets to find answers to novel questions in the field. We focus on the legal, governance and logistical considerations required to hold and analyse data acquired from disparate sources and outline some of the solutions to these challenges. We discuss the infrastructure resources required and the essential considerations in data curation and informatics management, and briefly discuss some of the analysis approaches currently used.

Topics: Data Collection; Datasets as Topic; Humans; Multimorbidity; Public Health Informatics

PubMed: 32622995
DOI: 10.1016/j.mad.2020.111310