Risk, sometimes also referred to as cumulative incidence, is the proportion of persons within a specified population who develop the outcome of interest within a defined time period. All persons under consideration must be free of the outcome of interest at the beginning of that period.

R = New cases / Persons at risk = A/N

where R is the estimated risk; A is the number of new instances of the outcome of interest, often described as new cases; and N is the number of unaffected persons at the beginning of the observation period. It is important to emphasize that at the outset, all persons under consideration must be free of the outcome of interest. The risk of developing the outcome then can range anywhere between 0 and 1. For simplicity, risk often is presented as a percentage by multiplying the proportion by 100.

Example: Vekeman and colleagues were interested in the risk of VTE after total hip or knee arthroplasty and whether the use of anticoagulants to prevent VTEs might induce an unacceptable number of episodes of serious bleeding. Through a large national database, the investigators were able to identify more than 820,000 inpatient hospital stays for adults age 18 years or older who underwent one of these procedures between 2000 and 2008. A total of 8042 VTEs were observed during these hospital stays. The risk of a VTE among total hip or knee replacement admissions, therefore, is:

R = 8042/820,197 = 0.0098 = 0.98%
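The calculation above can be sketched in Python. This is a minimal illustration rather than the investigators' method; the function name `cumulative_incidence` and its input checks are assumptions introduced here.

```python
def cumulative_incidence(new_cases: int, at_risk: int) -> float:
    """Risk R = A / N, where all N persons are outcome-free at baseline."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    if not 0 <= new_cases <= at_risk:
        raise ValueError("new_cases must lie between 0 and at_risk")
    return new_cases / at_risk


# Figures from the Vekeman example: 8042 VTEs among 820,197 hospital stays.
risk = cumulative_incidence(8042, 820_197)
print(f"R = {risk:.4f} = {risk:.2%}")
```

The range check mirrors the statement that risk is a proportion between 0 and 1: the number of new cases can never exceed the number of persons at risk.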


The proportion of persons within a population who have the condition of interest is referred to as prevalence. Sometimes we designate this proportion further as relating to a specific point in time (point prevalence) or alternately, to a particular time period (period prevalence). The prevalence is calculated by dividing the number of affected persons (cases) by the number of persons in the source population.

P = C/N

where P is the prevalence, C is the number of cases, and N is the size of the source population. As with risk, prevalence can range from 0 to 1. We can also express prevalence as a percentage, by multiplying by 100.

Example: Deitelzweig and colleagues were interested in estimating the prevalence of VTE in the United States. For that purpose, they accessed a database that combined commercial insurance claims with those of Medicare beneficiaries for the 5-year period 2002 to 2006. The source population of this database included 12.7 million persons. Of these persons, 200,007 had VTE, so the 5-year period prevalence was:

P = 200,007/12.7 million = 0.016 = 1.6%

The investigators calculated the 5-year period prevalence separately for DVT, PE, and both DVT and PE. The annual prevalence of VTE was observed to increase progressively over the 5-year study period, with a low of 0.32% in 2002, rising to a high of 0.42% in 2006.
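The 5-year period prevalence above can be reproduced with a short Python sketch. The function name `prevalence` is illustrative, and the denominator uses the rounded figure of 12.7 million given in the text, so the result is approximate.

```python
def prevalence(cases: int, population: int) -> float:
    """Prevalence P = C / N for a stated point in time or time period."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population


# 12_700_000 is the rounded source-population size quoted in the text.
p = prevalence(200_007, 12_700_000)
print(f"P = {p:.3f} = {p:.1%}")
```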

Incidence Rate

The incidence rate measures the rapidity with which newly diagnosed cases of a disease develop. To estimate the incidence rate, one follows a source population of unaffected persons over time, counts the number of individuals who become newly affected (cases), and expresses that count relative to person-time, which is a combination of the size of the source population and the time period of observation.

The quantification of person-time may seem a little confusing at first, so let us explore how it is calculated. The goal is to estimate the total amount of disease-free time during which subjects in the source population are observed. For example, an individual who is followed for 1 year without developing the condition of interest contributes 1 year of observation. Another person may develop the condition of interest 6 months into the study. Although this individual may be followed for a full year, he or she contributes only a half year of disease-free observation. These individual contributions are then summed over all persons in the source population, yielding the total person-time of observation. Then, we can calculate the incidence rate as:

IR = New cases / Person-time = A/PT

where IR is incidence rate, A is the number of newly diagnosed occurrences of the condition of interest, and PT is the total amount of disease-free observation within the source population.
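To make the person-time bookkeeping concrete, the following Python sketch sums disease-free time over a small cohort and divides the number of new cases by that total. The cohort is hypothetical: the first two subjects mirror the two individuals described in the text, and the `Subject` class and helper names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subject:
    followed_years: float                # total time under observation
    onset_years: Optional[float] = None  # time of diagnosis; None if never affected


def is_case(s: Subject) -> bool:
    """A case is counted only if onset occurs during the observed follow-up."""
    return s.onset_years is not None and s.onset_years <= s.followed_years


def disease_free_years(s: Subject) -> float:
    """Time at risk: stops accumulating once the condition develops."""
    return s.onset_years if is_case(s) else s.followed_years


# Hypothetical cohort of four persons.
cohort = [
    Subject(1.0),                    # followed 1 year, never affected
    Subject(1.0, onset_years=0.5),   # affected 6 months into the study
    Subject(1.0),
    Subject(1.0, onset_years=0.25),
]

person_time = sum(disease_free_years(s) for s in cohort)  # PT
new_cases = sum(is_case(s) for s in cohort)               # A
incidence_rate = new_cases / person_time                  # IR = A / PT

print(f"PT = {person_time} person-years, A = {new_cases}")
print(f"IR = {incidence_rate:.3f} cases/person-year")
```

Note that the second subject contributes only 0.5 person-years even though follow-up lasted a full year, which is the point the paragraph above makes.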

Example: To estimate the incidence rate of VTE in the Canadian province of Quebec, Tagalakis and colleagues accessed health care administrative databases to identify all new cases of DVT or PE between 2000 and 2009. The overall incidence of VTE was found to be:

IR = 91,761 cases/74,297,764 person-years = 0.00124 cases/person-year

To express the incidence rate with fewer decimal places, it is convenient to convert it to 1.24 cases/1000 person-years of observation. An equivalent expression would be 124 cases per 100,000 person-years of observation. In other words, among residents of the province of Quebec, during the decade of 2000 to 2009, the overall incidence rate of newly diagnosed VTEs was a little more than one case per 1000 persons followed for 1 year.
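The rescaling of the rate is simple multiplication, as the following Python sketch shows using the Quebec figures. The function name `rate_per` and its `per` parameter are illustrative choices, not terminology from the study.

```python
def rate_per(cases: int, person_time: float, per: int = 1) -> float:
    """Incidence rate expressed per `per` units of person-time."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return cases / person_time * per


cases = 91_761
person_years = 74_297_764

# Same rate, three equivalent scalings.
for per in (1, 1_000, 100_000):
    value = rate_per(cases, person_years, per)
    print(f"{value:.3g} cases per {per:,} person-years")
```

Changing the multiplier alters only the presentation; the underlying rate is identical in all three expressions.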

It is important to note that the incidence rate relates to the first occurrence of the disease or condition of interest. VTE is a disorder that can recur, so if all episodes of VTE in a population (both initial and recurrent) are counted, the estimate of the VTE incidence rate will be inflated. To avoid this problem, the investigator must be able to exclude prior diagnoses of VTE when identifying incident cases.


For diseases such as VTE that can have serious impacts on an affected person’s well-being, we may wish to characterize the likelihood of remaining alive, or survival, after a diagnosis. Mathematically, we would measure survival as:

S = (A – D)/A

where A is the number of newly diagnosed patients with the condition of interest and D is the number of deaths. Survival is, therefore, a proportion that can range from 0 to 1. We can convert survival to a percentage by multiplying by 100. It is important to recognize that survival is a time-dependent phenomenon; therefore, it is essential to specify a time period for the measurement of survival, such as the 30-day survival or the 1-year survival.

Example: In the study by Tagalakis and colleagues of VTE in Quebec province, patients were followed for survival after their initial diagnosis. Among the 33,447 persons with a PE, there were 5654 deaths within the first 30 days after diagnosis. The 30-day survival, therefore, is calculated as:

S = (33,447 – 5654)/33,447 = 0.83 = 83%
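The 30-day survival above can be checked with a brief Python sketch. The function name `survival` is an illustrative assumption; the time window (here 30 days) is not a parameter of the formula but determines which deaths are counted in D.

```python
def survival(diagnosed: int, deaths: int) -> float:
    """Survival S = (A - D) / A over a stated time window."""
    if diagnosed <= 0:
        raise ValueError("diagnosed must be positive")
    if not 0 <= deaths <= diagnosed:
        raise ValueError("deaths must lie between 0 and diagnosed")
    return (diagnosed - deaths) / diagnosed


# Tagalakis example: 33,447 persons with PE, 5654 deaths within 30 days.
s30 = survival(33_447, 5_654)
print(f"30-day survival = {s30:.2f} = {s30:.0%}")
```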


Another measure of prognosis after a diagnosis is the case-fatality. This metric refers to the proportion (or percentage) of persons with a particular condition who die within a specified period of time. Often, case-fatality is incorrectly referred to as a rate or ratio, but it is more accurately described as a risk or probability. It is calculated mathematically as:

CF = Number of deaths / Number of diagnosed persons = D/A

where CF is case-fatality, D is the number of deaths, and A is the number of persons with the condition of interest at the beginning. The case-fatality can range from 0, when no deaths are observed during the specified timeframe, to 1, when all affected persons (with the diagnosis[es] of interest) die during the specified timeframe.
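Because every diagnosed person either dies or survives within the specified period, case-fatality is the complement of survival over the same period: CF = D/A = 1 - S. The following Python sketch illustrates this with the PE figures from the Tagalakis example (33,447 diagnosed, 5654 deaths within 30 days); the function name `case_fatality` is an illustrative assumption.

```python
def case_fatality(diagnosed: int, deaths: int) -> float:
    """Case-fatality CF = D / A over a stated time window."""
    if diagnosed <= 0:
        raise ValueError("diagnosed must be positive")
    if not 0 <= deaths <= diagnosed:
        raise ValueError("deaths must lie between 0 and diagnosed")
    return deaths / diagnosed


diagnosed, deaths = 33_447, 5_654

cf30 = case_fatality(diagnosed, deaths)
s30 = (diagnosed - deaths) / diagnosed  # survival over the same 30 days

print(f"30-day case-fatality = {cf30:.2f} = {cf30:.0%}")
print(f"CF + S = {cf30 + s30:.2f}")
```

Reporting either measure is sufficient, since one can always be recovered from the other as long as both refer to the same time period.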