Starting From the End — The Detective’s Study Design

Where We Left Off

A week had passed since Dr. Junaid Rashid walked out of the last session with a cold cup of tea and a question he finally knew how to answer.

He had his cross-sectional study mapped out. Hypertensive patients in his OPD. A validated questionnaire. A sampling strategy. Three months, he had said. Doable.

He had been right. And he had not stopped there.

He arrived this evening with his notebook open to a new page and a new question written at the top in the same careful handwriting.

“Why are my patients non-compliant in the first place?”

He read it out before he had even sat down.

“I know the prevalence now,” he said. “Or I will, once I run the study. But that still doesn’t tell me why. What are the actual risk factors? I want to find the causes.”

Dr. Sumaira Talib, already seated, looked up with an expression that suggested she had been thinking about the same problem.

Dr. Hammad Ali was drawing something in his notebook that may or may not have been a flowchart.

Dr. Hassan Raza had his arms crossed in the way he does when he is waiting to see where something is going.

Dr. Muhammad Yaqoob let Dr. Junaid’s question sit in the air for a moment.

“Last time,” Dr. Muhammad Yaqoob said, “I told you the cross-sectional study cannot tell you which came first, the exposure or the outcome. You found that out yourself, if I remember correctly.”

“Association, not causation,” Dr. Junaid said immediately.

“Good. You remembered. So if you want to get closer to causation, you need a design that can see time. That can say: this exposure happened before this outcome. Tonight, we start with the first design that gives you that. The case-control study.”

The Detective Who Works Backwards

Dr. Muhammad Yaqoob walked to the whiteboard and drew two columns.

On the left: CASES. On the right: CONTROLS.

“Here is the logic,” Dr. Muhammad Yaqoob said. “You already know who got sick. Your cases are the patients who have the disease. In Dr. Junaid’s scenario, let’s say they are the patients who had a hypertensive crisis and were admitted to your ward. Your controls are patients who did not have a crisis: similar in age, sex, perhaps from the same hospital, but without the outcome.”

Dr. Muhammad Yaqoob drew arrows pointing backwards from both columns.

“Now you go back in time. You ask both groups: What were you exposed to before this happened? Were you non-compliant with medication? Did you have high salt intake? Did you have uncontrolled stress? Were you a smoker?”

Dr. Muhammad Yaqoob turned around.

“You are starting from the outcome, the disease, and tracing backwards to find the exposure. Most study designs move forward through time. The case-control study moves backwards. It is the detective’s design. You already have the body. You are now looking for the cause of death.”

Dr. Hammad Ali stopped drawing and looked up. “So you already know who got sick and who didn’t, and then you look at what they were exposed to in the past?”

“Exactly.”

“That’s… actually clever,” he said, with the mild surprise of someone who had expected to be bored.

The Question the Cases and Controls Can Answer

Dr. Sumaira Talib raised her hand. “Sir, what does the analysis actually give you? What number do you calculate?”

This was the right question to ask, and she knew it.

“The Odds Ratio,” Dr. Muhammad Yaqoob said. He wrote it on the board: OR — Odds Ratio.

“Not relative risk. Not prevalence. The Odds Ratio. In a case-control study, you are not following a population through time to see who develops the disease. You are selecting people who already have it. That means you cannot calculate incidence. You cannot calculate risk. What you can calculate is the odds of exposure among your cases compared to the odds of exposure among your controls.”

Dr. Muhammad Yaqoob wrote the formula simply:

OR = (Odds of exposure in cases) ÷ (Odds of exposure in controls)

“An OR of 1 means no association. An OR greater than 1 indicates the exposure is more common among cases, suggesting it may be a risk factor. An OR less than 1 suggests it may be protective.”

Dr. Junaid was writing carefully. “So if my OR for medication non-compliance is 4.5…”

“Then the odds of having been non-compliant are 4.5 times higher among patients who had a crisis compared to those who did not. That is a meaningful signal. Not proof of causation, but a meaningful signal that demands further investigation.”

He underlined something in his notebook.
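
To make the arithmetic concrete, here is a minimal sketch of the calculation in Python. The counts are hypothetical, chosen only so that the result lands on the 4.5 from the conversation; they are not data from any study. The confidence interval uses the Woolf (log-odds) approximation, which is an addition for illustration and not something covered in the session.

```python
import math


def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure in cases divided by odds of exposure in controls.

    Algebraically this is the cross-product of the 2x2 table.
    """
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


def woolf_interval(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed, z=1.96):
    """Approximate 95% confidence interval, computed on the log-odds scale."""
    log_or = math.log(
        odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed)
    )
    standard_error = math.sqrt(
        1 / cases_exposed
        + 1 / cases_unexposed
        + 1 / controls_exposed
        + 1 / controls_unexposed
    )
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts (assumption): 100 crisis patients and 100 controls.
# Non-compliant: 60 of the cases, 25 of the controls.
or_value = odds_ratio(60, 40, 25, 75)
low, high = woolf_interval(60, 40, 25, 75)
print(f"OR = {or_value:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With these made-up counts the odds of exposure are 60/40 among cases and 25/75 among controls, so the ratio is 4.5. An interval that stays above 1 is what would make such a signal worth following up.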

Why This Design Exists — And When It Is the Only Sensible Choice

Dr. Muhammad Yaqoob sat on the edge of the desk, which is where he tends to end up when the important part of an explanation arrives.

“Tell me,” Dr. Muhammad Yaqoob said to the group, “what would happen if you tried to study a rare disease using a cohort design? Say, a condition that affects one in every ten thousand patients?”

Dr. Hassan Raza answered immediately. “You would need an enormous sample. Tens of thousands of participants, followed for years, before you saw enough cases to analyse.”

“And the cost?”

“Enormous. The time? Even worse.”

“Exactly. This is why the case-control study exists. It was built for exactly this problem. Rare diseases. Conditions with long latency periods, such as diseases that take 10 or 20 years to develop after exposure. You cannot follow people for twenty years and wait. But you can find the people who already have the disease, find a matched group who do not, and look back.”
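
Dr. Hassan’s point can be checked with back-of-the-envelope arithmetic. The sketch below assumes that “one in every ten thousand” means one new case per 10,000 people per year, and that 100 cases is enough to analyse; both are assumptions for illustration. It also ignores dropouts, so a real cohort would need to be larger still.

```python
def cohort_size_needed(target_cases, people_per_case_per_year, follow_up_years):
    """Rough number of participants a cohort needs to observe target_cases.

    Assumes a constant incidence of one case per `people_per_case_per_year`
    people each year and no loss to follow-up (both simplifications).
    """
    person_years = target_cases * people_per_case_per_year
    # Ceiling division, so a fractional participant rounds up.
    return -(-person_years // follow_up_years)


# Hypothetical target (assumption): 100 cases, followed for 5 years.
needed = cohort_size_needed(target_cases=100, people_per_case_per_year=10_000, follow_up_years=5)
print(f"Participants needed: {needed:,}")
```

Under these assumptions the cohort needs 200,000 participants followed for five years. A case-control study would instead start from the 100 patients who already have the disease.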

Dr. Muhammad Yaqoob listed the situations on the board:

Use a case-control study when:

  • The disease is rare
  • The latency period between exposure and outcome is long
  • You want to study multiple risk factors for a single outcome
  • Time and resources are limited
  • You need an answer faster than a cohort study can provide

“Case-control studies gave us some of the first clear evidence that smoking causes lung cancer,” Dr. Muhammad Yaqoob said. “Richard Doll and Austin Bradford Hill. 1950. They did not follow thousands of smokers for twenty years and watch them develop cancer. They took lung cancer patients (cases), compared them to patients without lung cancer (controls), and asked about their smoking histories. The association was so strong, so consistent, so impossible to explain away, that it changed medicine permanently.”

The room was quiet.

Dr. Junaid looked at the board for a long moment. “And we can do that. In Pakistan. With the diseases that are here.”

“Yes,” Dr. Muhammad Yaqoob said. “You can.”

The Limitations Nobody Warns You About — Until It Is Too Late

“Now,” Dr. Muhammad Yaqoob said, “the part that will save you from a rejected manuscript.”

He had their full attention.

“Recall bias.”

Dr. Muhammad Yaqoob wrote it on the board.

“You are asking people to remember what they were exposed to in the past. Your cases — the ones who had the crisis — have spent days or weeks in a hospital, thinking about why this happened to them. They are highly motivated to remember everything they did wrong. They will dig deep. They will reconstruct. Sometimes they will over-report exposures because they are looking for an explanation.”

Dr. Muhammad Yaqoob paused.

“Your controls, who are sitting in the waiting room feeling perfectly fine, are less motivated to reconstruct their dietary and medication history from eight months ago. They may under-report. The result? Your cases appear more exposed than they actually were, not because they truly were, but because they remembered better.”

Dr. Sumaira Talib nodded slowly. “So the bias isn’t in the disease. It’s in the memory.”

“It is always in the memory,” Dr. Muhammad Yaqoob said. “This is why validated questionnaires, structured data collection tools, and — where possible — objective records from hospital files matter more in case-control studies than almost any other design. You do not want to be measuring memory. You want to be measuring exposure.”
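
A small numerical sketch shows how much damage uneven memory can do. In the Python below, cases and controls are given exactly the same true exposure, so the true OR is 1. The only difference is that controls report just three quarters of their real exposures. Every number here is hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Cross-product odds ratio from a 2x2 table."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


def reported_table(n_cases, n_controls, true_exposure, recall_cases, recall_controls):
    """Expected 2x2 counts when each group reports only a fraction of true exposures.

    `recall_cases` and `recall_controls` are the fractions of truly exposed
    people who report their exposure (hypothetical parameters).
    """
    cases_reported = n_cases * true_exposure * recall_cases
    controls_reported = n_controls * true_exposure * recall_controls
    return (
        cases_reported,
        n_cases - cases_reported,
        controls_reported,
        n_controls - controls_reported,
    )


# Same true exposure (40%) in both groups, so there is no real association.
no_bias = odds_ratio(*reported_table(100, 100, 0.40, recall_cases=1.0, recall_controls=1.0))
with_bias = odds_ratio(*reported_table(100, 100, 0.40, recall_cases=1.0, recall_controls=0.75))

print(f"OR with equal recall:   {no_bias:.2f}")
print(f"OR with unequal recall: {with_bias:.2f}")
```

With equal recall the OR is 1. When controls forget a quarter of their exposures, the OR rises to roughly 1.56 even though nothing real separates the two groups. That is an association manufactured entirely by memory.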

Dr. Muhammad Yaqoob moved to the second limitation.

“Selection of controls.” He underlined it twice. “This is where most case-control studies quietly fall apart. Your controls must come from the same source population as your cases. If your cases are hospital inpatients, your controls cannot be healthy community volunteers, because healthy community volunteers are, by definition, different from people who end up in hospitals. The comparison becomes meaningless.”

Dr. Hammad Ali looked up. “So where do you get controls?”

“From the same hospital. Patients admitted for unrelated conditions. Sometimes, from the community, if your cases were community-recruited. The rule is: if a control had developed the disease, that person would have been recruited as a case. That is the standard. Hold yourself to it.”

Dr. Muhammad Yaqoob looked at Dr. Junaid. “What else can the case-control study not give you?”

He thought for a moment. “Incidence. Prevalence.”

“Correct. You selected your sample based on outcome status. You cannot calculate how many people in the population develop the disease. For that, you need the cohort.”

The Moment Hammad Asked the Wrong Question Correctly

“Sir,” said Hammad, “can’t the OR be confused with relative risk? Like — can’t you just report OR and people will think you’re reporting risk?”

It was the wrong question, framed correctly enough to deserve a full answer.

“Yes,” Dr. Muhammad Yaqoob said. “And this is a real problem in published literature. The Odds Ratio is not the same as the Relative Risk. When the outcome is rare — less than 10% in the population — the OR approximates the Relative Risk reasonably well, and many researchers use them interchangeably in that context [1]. But when the outcome is common, the OR exaggerates the association compared to the relative risk. Reporting an OR of 3.2 for a common outcome and calling it ‘three times the risk’ is technically incorrect and inflates the apparent effect size.”

Dr. Muhammad Yaqoob wrote on the board:

Rare outcome → OR ≈ RR (acceptable approximation)

Common outcome → OR overestimates RR (do not conflate)

“Know your formula. Know what it measures. Know when the approximation holds and when it does not. Reviewers know. Editors know. If your paper conflates these and the reviewer catches it, and they will, the manuscript comes back.”
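
The rule on the board can be seen in numbers. The sketch below fixes the relative risk at 2 and changes only how common the outcome is in the unexposed group. It assumes the true risks in a population are known, which a case-control study itself cannot provide; the baseline risks are hypothetical.

```python
def odds(probability):
    """Convert a risk (probability) into odds."""
    return probability / (1 - probability)


def odds_ratio_for_fixed_rr(baseline_risk, relative_risk):
    """Odds ratio implied by a given relative risk and baseline risk."""
    exposed_risk = baseline_risk * relative_risk
    if not 0 < exposed_risk < 1:
        raise ValueError("baseline_risk * relative_risk must lie between 0 and 1")
    return odds(exposed_risk) / odds(baseline_risk)


# Hypothetical baseline risks (assumption), from rare to common outcomes.
for baseline in (0.01, 0.05, 0.10, 0.30):
    implied_or = odds_ratio_for_fixed_rr(baseline, relative_risk=2.0)
    print(f"baseline risk {baseline:.0%}: RR = 2.00, OR = {implied_or:.2f}")
```

At a 1% baseline risk the OR is about 2.02, close enough to the relative risk of 2. At a 30% baseline risk the same relative risk gives an OR of 3.5, which is why calling an OR “times the risk” misleads when the outcome is common.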

What Happened After the Session

Dr. Junaid stayed behind again.

This was becoming a habit, and Dr. Muhammad Yaqoob did not mind it.

“Sir,” he said, “so for my non-compliance study, I already have the cross-sectional question mapped. But now I’m thinking: what if I also want to know which specific risk factors predict a hypertensive crisis in non-compliant patients?”

“Then you have a case-control question,” Dr. Muhammad Yaqoob said.

“Cases would be the non-compliant patients who did have a crisis. Controls would be non-compliant patients who did not. And I look back at what was different between them.”

He said it without prompting. Correctly. Completely.

“Yes,” Dr. Muhammad Yaqoob said.

He was quiet for a moment.

“Fifteen years,” he said, not to Dr. Muhammad Yaqoob exactly, but more to the room, or to himself. “Fifteen years I’ve been watching this happen and not knowing how to study it.”

“You know now,” Dr. Muhammad Yaqoob said.

“I know now,” he agreed.

He closed his notebook and picked up his bag. At the door, he turned back.

“Next time is the cohort?”

“Next time is the cohort. The study that moves forward through time — instead of backwards.”

He nodded. “I’ll think about it before I come.”

He always does, now. That is what fifteen years of unasked questions do when they finally find the right room.

References:

  1. Dettori JR, Norvell DC, Chapman JR. Risks, rates and odds: What’s the difference and why does it matter? Global Spine Journal. 2021 Sep;11(7):1156-8.

Follow the UPMED Medical Consultancy Channel to stay updated on the 120-post journey of this research series. We will share posts covering all the latest updates and progress. Link: https://whatsapp.com/channel/0029VaCu9r86buMKJD4wx40j

You can also connect with the writer of this blog post series to share or receive suggestions: Dr. Junaid Rashid (Founder of UPMED) at 03042397393 (WhatsApp).
