BRCA

This dataset consists of the results of 987 screening mammograms administered at the Group Health Cooperative in the state of Washington during the year 2002. Five radiologists were selected at random from those who read a large number of mammograms in this cooperative. For each of these radiologists, approximately 200 of the mammograms they read were selected at random. Recorded for each mammogram is a numeric code (1-999) identifying the radiologist who read it, along with two outcomes.

One outcome is an indicator of whether or not there was a breast cancer diagnosis within 12 months following the screening mammogram (1=Yes, 0=No); the second is an indicator of whether or not the subject was recalled for further diagnostic testing (1=recall for further diagnostic testing, 0="normal"). In addition, several risk factors identified in previous studies are provided; referent values for a "typical female" are indicated by asterisks:

AGE 40-49*, 50-59, 60-69, 70 and older
FAMILY HISTORY OF BREAST CANCER 0=No*, 1=Yes
HISTORY OF BREAST BIOPSY/SURGERY 0=No*, 1=Yes
BREAST CANCER SYMPTOMS 0=No*, 1=Yes
MENOPAUSE/HORMONE THERAPY STATUS Pre-menopausal, Post-menopausal & no HT, Post-menopausal & HT*, Post-menopausal & unknown HT
PREVIOUS MAMMOGRAM 0=No*, 1=Yes
BREAST DENSITY CLASSIFICATION 1=Almost entirely fatty, 2=Scattered fibroglandular tissue*, 3=Heterogeneously dense, 4=Extremely dense

All entries are numeric. For risk factors with just two levels, the referent level is represented by zero and the alternative by one. For risk factors having more than two levels, the referent level is specified and indicator columns are provided only for the nonreferent levels. Thus, in all cases, the referent level is represented by zeros in all of its columns.
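
The indicator coding described above can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the helper name decode_level and the label strings are assumptions chosen here for readability. It maps one group of 0/1 indicator columns back to a single level, returning the referent level when every indicator is zero.

```python
def decode_level(indicators, nonreferent_labels, referent_label):
    """Map a group of 0/1 indicator columns back to one level label.

    Hypothetical helper: the name and the label strings are illustrative
    and do not come from the dataset itself.
    """
    if len(indicators) != len(nonreferent_labels):
        raise ValueError("one label is needed per indicator column")
    if any(value not in (0, 1) for value in indicators):
        raise ValueError("indicators must be 0 or 1")
    if sum(indicators) > 1:
        raise ValueError("at most one indicator in a group may be 1")
    for flag, label in zip(indicators, nonreferent_labels):
        if flag == 1:
            return label
    # All indicators are zero, so the row is at the referent level.
    return referent_label


# Age is coded by three indicator columns; 40-49 is the referent level.
AGE_LABELS = ["50-59", "60-69", "70+"]

print(decode_level([0, 0, 0], AGE_LABELS, "40-49"))  # referent level
print(decode_level([0, 1, 0], AGE_LABELS, "40-49"))  # second nonreferent level
```

The same helper would apply to the menopause/hormone therapy and breast density groups by passing their own label lists and referent labels.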

Column coding is as follows:

Col 1 -- radiologist ID (1--999)
Col 2 -- cancer outcome (1=Cancer, 0=Normal)
Col 3 -- recall outcome (1=Y 0=N)
Col 4 -- intercept (always 1)
Col 5 -- patient age 50-59 (the referent is patient age 40-49)
Col 6 -- patient age 60-69
Col 7 -- patient age 70+
Col 8 -- family history of breast cancer (1=Y 0=N)
Col 9 -- history of breast biopsy/surgery (1=Y 0=N)
Col 10 -- breast cancer symptoms (lump/nipple discharge; 1=Y 0=N)
Col 11 -- pre-menopause (referent is post menopause with hormone treatment)
Col 12 -- post-menopause, no hormone treatment
Col 13 -- post-menopause, unknown hormone treatment
Col 14 -- patient had a previous mammogram
Col 15 -- breast density 1 (breast density 2 is the referent)
Col 16 -- breast density 3
Col 17 -- breast density 4
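
As a rough sketch of how the 17 columns listed above might be read, the Python below attaches a name to each column position. Several things here are assumptions rather than facts about the file: that the fields are separated by whitespace, that there is no header row, and the column names themselves, which are invented for this example. The two sample rows are made up for illustration and are not taken from the data.

```python
import io

# Hypothetical column names, in the order given by the column coding above.
COLUMNS = [
    "radiologist_id", "cancer", "recall", "intercept",
    "age_50_59", "age_60_69", "age_70_plus",
    "family_history", "biopsy_history", "symptoms",
    "pre_menopause", "post_meno_no_ht", "post_meno_unknown_ht",
    "previous_mammogram", "density_1", "density_3", "density_4",
]


def parse_brca(lines):
    """Parse whitespace-delimited rows into dicts keyed by COLUMNS.

    Assumes one mammogram per line, 17 numeric fields, and no header row.
    """
    records = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        # int(float(...)) accepts both "1" and "1.0" style entries.
        values = [int(float(field)) for field in fields]
        records.append(dict(zip(COLUMNS, values)))
    return records


# Two made-up rows, for illustration only.
sample = io.StringIO(
    "7 0 1 1 0 1 0 0 0 0 0 1 0 1 0 1 0\n"
    "12 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0\n"
)
rows = parse_brca(sample)
print(rows[0]["radiologist_id"], rows[0]["recall"], rows[0]["age_60_69"])
```

To read the actual file, an open file handle could be passed in place of the sample, for example parse_brca(open("BRCA.txt")), where the local filename is likewise an assumption.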

File:
Download BRCA (txt - 33.74 KB)