
De-identification Practices

1 Are Canadians identifiable by their age, gender, and residence forward sortation area?

Many studies collect the combination of age, gender, and residence Forward Sortation Area (FSA), and many disclosed datasets include these three variables. Does that represent a privacy risk? In one of our studies we analyzed…

2 Can a person be re-identified from their diagnosis code?

In many discussions about re-identification risk and de-identification the focus is on demographic variables. But many data sets also include diagnosis codes (for example, ICD-10 codes). We will answer the question of whether these can be used for re-identification…

3 Can a voter list be used for re-identification?

Much of the literature makes the point that voter lists can be used for re-identification. However, the accuracy of this statement depends on your jurisdiction. In the US, many states make their voter lists available for free or for a small fee. Often there…

4 Can individuals be re-identified from disease maps?

Increasingly, public health units, the media, and researchers are publishing or posting maps on the web showing locations of individuals with particular diseases. Do these maps represent a high re-identification risk? There have been studies showing that…

5 Can postal codes re-identify individuals?

Postal codes are the smallest geographic unit that Canada Post uses to deliver mail. In a health care context they are the most common geographic unit because that is what patients know and are able to provide. Therefore they are often collected. The re-identification…

6 Categories of variables in a data set - a re-identification risk management perspective

It is important to be able to categorize variables in a data set according to their role in re-identification because it helps us reason about risk. Below is one categorization that we have found useful. The specific scenario we are looking at is that of…

7 Definition of identifiable dataset - if a person can find their record(s) in the dataset

One question that sometimes comes up is whether a data set can be considered identifiable if a person can find their own record(s) in it. This definition can be analyzed from a number of different perspectives. A person may not know if they are in a data…

8 How can I de-identify longitudinal records?

At the outset, it is important to make a distinction between three types of longitudinal records that occur often in practice. The first type consists of specific variables that are collected from all patients at specific points in time. For example, if function…

9 How can I safely release data to multiple researchers - scenario I?

First, let's consider the scenario. We have a data custodian who wants to disclose data to researcher A and researcher B. Each researcher will get a different set of variables. But the two data sets pertain to the same individuals/patients. This is a rather…

10 How can I safely release data to multiple researchers - scenario II?

Under this scenario, a data custodian is providing data to two researchers, A and B. The two data sets have some overlapping variables, for example, they may both have the patients' date of birth and postal codes. Also, the two data sets have no directly…
