Development and validation of a computerized South Asian names and group recognition algorithm (SANGRA) for use in British health-related studies

Abstract

Background: Studies on ethnic variations in health have played an important role in aetiological and health services research. Most routine datasets, however, do not include information on ethnicity. South Asians, one of the largest minority ethnic groups in Britain, have distinctive names that also allow differentiation of the main sub-groups with their important differences in health-related exposures and disease risks.

Methods: A computerized name recognition algorithm (SANGRA) was developed incorporating directories of South Asian first names and surnames together with their religious and linguistic origin. SANGRA was validated using health-related data with self-ascribed information on ethnicity.

Results: SANGRA was successful in recognizing South Asian origin in reference datasets, with sensitivity of 89-96 per cent, specificity of 94-98 per cent, positive predictive value (PPV) of 80-89 per cent and negative predictive value (NPV) of 98-99 per cent. Religious origin was correctly assigned in the majority of cases: sensitivity, specificity and PPV were 94 per cent, 91 per cent and 90 per cent for Hindus; 90 per cent, 99 per cent and 98 per cent for Muslims; and 76 per cent, 99 per cent and 94 per cent for Sikhs. SANGRA correctly identified 76 per cent of Gujerati and 70 per cent of Punjabi names, although only 62 per cent of Gujerati names were sufficiently distinct to be allocated to the Gujerati-only category and only 53 per cent of Punjabi names were allocated to the Punjabi-only category. However, specificity and PPV were high for both languages (respectively 97 per cent and 93 per cent for Gujerati, and 99 per cent and 97 per cent for Punjabi).

Conclusions: SANGRA provides a practical and valid method of ascertaining South Asian origin by name and, to a lesser degree of accuracy, of differentiating between the main religious and linguistic subgroups living in Britain. This algorithm will be useful in health-related studies where information on self-ascribed ethnicity is not available or is of a limited nature.
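The four validation measures quoted in the abstract (sensitivity, specificity, PPV and NPV) all derive from a 2x2 comparison of an algorithm's assignment against self-ascribed ethnicity. The following is a minimal sketch of that arithmetic in Python, not the authors' implementation: the function name `validation_metrics`, the `ValidationMetrics` container and the boolean-list input format are assumptions made for illustration, and the sample values are toy data, not figures from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ValidationMetrics:
    """Hypothetical container for the four measures named in the abstract."""
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined
    # (for example, no positive cases in the reference data).
    return numerator / denominator if denominator else float("nan")


def validation_metrics(predicted: Sequence[bool],
                       reference: Sequence[bool]) -> ValidationMetrics:
    """Compare algorithm output with the self-ascribed reference standard.

    predicted[i] is True if the algorithm assigned record i to the group;
    reference[i] is True if the person self-ascribed to that group.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives

    return ValidationMetrics(
        sensitivity=_ratio(tp, tp + fn),  # share of true members recognized
        specificity=_ratio(tn, tn + fp),  # share of non-members excluded
        ppv=_ratio(tp, tp + fp),          # share of assignments that are correct
        npv=_ratio(tn, tn + fn),          # share of exclusions that are correct
    )


# Toy data only: one of each outcome (TP, FN, FP, TN).
example = validation_metrics(
    predicted=[True, False, True, False],
    reference=[True, True, False, False],
)
```

Because PPV and NPV depend on how common the group is in the dataset being tested, the same algorithm can show different predictive values across reference datasets even when sensitivity and specificity are stable.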

Citation (APA)

Nanchahal, K., Mangtani, P., Alston, M., & Dos Santos Silva, I. (2001). Development and validation of a computerized South Asian names and group recognition algorithm (SANGRA) for use in British health-related studies. Journal of Public Health Medicine, 23(4), 278–285. https://doi.org/10.1093/pubmed/23.4.278
