Scoring Methodology
This page explains how every score on the site is calculated, so you can decide for yourself whether to trust it. Most supplement review sites are affiliate farms with a scoring rubric bolted on after the fact. We built the rubric first.
What follows is exactly how we evaluate every product in our database, so you can check our work instead of taking our word for it. The same standard applies to every product. It does not change from brand to brand, and affiliate commissions never touch the scores.

How affiliate commissions affect our scores
It is a fair question, and the answer is that they don't. We do earn a commission when you buy through our links, but every product runs through the same fixed rubric whether we have an affiliate relationship with the brand or not. Many of our highest-scoring products come from brands we earn nothing from.
If we get a score wrong, we fix the page and note the correction. If a supplement does not work, we say so. If a popular product is, in our view, overpriced or underdosed, we say that too. The methodology is here to protect you, not the brands.
The Two-Tier System
When you look at a single supplement score somewhere, it is usually answering two very different questions at once: does this ingredient work, and did this brand make the product well? Mashing them together hides what you actually need to know. A nicely made product with zero clinical evidence ends up looking the same as a sloppily labeled product backed by strong research. Those are not the same problem, and you should not have to untangle them yourself.
So we split them into two scores you can read separately.
Evidence Rating
Does this ingredient actually do what it claims? Rated once per ingredient type. Whether you buy Thorne or Nature Made magnesium glycinate, the underlying clinical evidence is identical. We rate the evidence for the molecule, not the brand.
Execution Score /100
Is this specific product well-made? Rated per brand. Two products can contain the same ingredient and differ enormously in dose accuracy, third-party testing, cost per effective dose, and label honesty. The execution score captures these differences.
Tier 1: Evidence Rating
This is the part that tells you whether the ingredient does anything in the first place. The evidence tier reflects the strength of clinical research supporting an ingredient's primary claimed benefit. Not all studies count the same to us: studies that pool many human trials (systematic reviews and meta-analyses of randomized controlled trials, the kind where people are randomly assigned to the supplement or a dummy pill) carry the most weight, followed by individual such trials, then studies that just observe people, then research done in animals or in a dish.
| Tier | Criteria |
|---|---|
| Strong Evidence | Multiple large RCTs and systematic reviews or meta-analyses confirming benefit. Consistent results across studies with meaningful effect sizes. The evidence base is strong enough that the medical community broadly recognizes the benefit. |
| Likely Effective | Several RCTs showing benefit with generally consistent results. Good evidence base but may have limitations in study design, sample size, or generalizability. The weight of evidence favors benefit. |
| Mixed Evidence | Limited RCTs, mixed or inconsistent results, or primarily observational studies. Evidence is suggestive but not conclusive. Reasonable people looking at the same data could disagree. |
| Weak Evidence | Mostly animal or in-vitro studies. Human data is scarce, poorly designed, or from very small trials. The mechanism is plausible but clinical evidence in humans is insufficient. |
| Ineffective | No meaningful evidence of benefit in humans, or evidence actively shows the supplement does not work for its primary claimed purpose. Marketing claims are not supported by research. |
Evidence ratings are ingredient-specific, not brand-specific. Creatine monohydrate is rated Strong Evidence, the top tier, regardless of who manufactures it. The rating reflects the research on the molecule.
Ingredients rated Ineffective, the lowest tier, do not receive brand scores. If the evidence shows an ingredient does not work for its primary claimed purpose, scoring how well a brand manufactured it is beside the point. We note this on the scorecard page and explain why.
A-F Claim Grades
The tier is our one-word verdict on an ingredient. Claim grades go a level deeper, because most supplements are sold on a long list of benefits and they are rarely all backed equally. On each supplement page, the evidence table grades each claim on its own, such as sleep quality, blood pressure, soreness, or immune support, so the one benefit a brand markets hardest cannot quietly stand in for the rest.
| Grade | Level | What it means |
|---|---|---|
| A | Strong | Consistent human evidence, usually including high-quality RCTs or systematic reviews, supports the claim. |
| B | Moderate | Human studies generally point in the same direction, but limitations remain in size, population, or replication. |
| C | Limited | The claim is plausible, but the evidence is mixed, early, indirect, or too small to treat as settled. |
| D | Weak | Support comes mostly from mechanisms, animal studies, observational data, or very thin human trials. |
| F | None | No meaningful human evidence supports the claim, or the available evidence argues against it. |
Claim grades are not product scores. A supplement can have one well-supported claim and several weak claims. A product can also execute well on dose, purity, value, and transparency even when the ingredient's evidence is only moderate.
Tier 2: Execution Score (0-100)
Once you know the ingredient works, the next question is whether this particular bottle is a good way to get it. That is what the execution score answers. Every scored product receives an execution score from 0 to 100, calculated from four equally-weighted pillars. Each pillar is worth up to 25 points.
Dosing & Form (0-25 points)
This is the most number-driven pillar, and the one most likely to catch a product that looks fine on the shelf. We line up what the bottle actually gives you against what the clinical research says you need to see an effect.
Dose adequacy (up to 17 points)
We calculate the ratio of the product's dose per recommended serving to the minimum clinically effective dose established in published RCTs. Products delivering 100% or more of the clinical dose receive full points. Products that underdose receive proportionally fewer points. A product at 80% of the clinical dose scores less than one at 100%. A product at 40% scores less still. This is arithmetic, not subjective judgment.
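A minimal sketch of this arithmetic, assuming points scale linearly with the fraction of the clinical dose delivered. The text above only says "proportionally", so the exact curve is an assumption, and the function name is illustrative rather than part of any published tool.

```python
def dose_adequacy_points(product_dose_mg: float,
                         clinical_dose_mg: float,
                         max_points: float = 17.0) -> float:
    """Return dose adequacy points, assuming linear scaling (an assumption).

    Products at or above the clinical dose get full points; underdosed
    products get points in proportion to the fraction of the dose delivered.
    """
    if clinical_dose_mg <= 0:
        raise ValueError("clinical_dose_mg must be positive")
    ratio = max(product_dose_mg, 0.0) / clinical_dose_mg
    # Cap at 1.0 so overdosing earns no extra points.
    return round(max_points * min(ratio, 1.0), 1)


print(dose_adequacy_points(600, 600))  # full clinical dose
print(dose_adequacy_points(480, 600))  # 80% of the clinical dose
print(dose_adequacy_points(240, 600))  # 40% of the clinical dose
```

Under this assumption, a product at 80% of the clinical dose earns 80% of the 17 available points, and a product at 40% earns 40% of them.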
Form quality (up to 8 points)
Not all forms of an ingredient absorb equally. Magnesium glycinate absorbs significantly better than magnesium oxide. Methylcobalamin is more readily utilized than cyanocobalamin for most people. We check whether the product uses the most bioavailable form supported by absorption research. Products using optimal forms receive full points. Products using inferior forms receive fewer.
Purity Verification (0-25 points)
Supplements are loosely regulated, so the label's promise is not the same as proof. An outside lab checking the actual product (third-party testing) is the only independent way to confirm a supplement contains what its label claims and is free from contaminants, heavy metals, and undeclared ingredients. We score based on how strong that outside check is.
| Verification Level | Score Range |
|---|---|
| USP Verified or NSF Certified for Sport | Highest |
| ConsumerLab approved, BSCG certified, or Informed Choice verified | High |
| GMP certified facility, no product-level testing | Moderate |
| No third-party verification | Low |
| Failed third-party testing or known contamination | Near-zero |
One distinction is worth keeping straight, because brands lean on it. Facility-level GMP certification means the factory follows baseline manufacturing standards. Product-level third-party testing means someone actually opened the specific bottle you are buying and checked what is inside. The second is the one that protects you.
Value (0-25 points)
Here is where the sticker price lies to you. We calculate cost per clinically effective daily dose. Not cost per pill. Not cost per serving. This is the only cost number worth comparing, and it is the metric that separates this site from most supplement review sites.
Say a product costs $0.10 per pill. If you need three pills a day to reach the dose that actually worked in trials, you are really paying $0.30 a day. A rival that costs $0.25 per pill but delivers the full dose in one capsule is the cheaper one once you do that math. The pill that looks pricey on the shelf wins where it counts. Most comparison sites never catch this, because they stop at the price on the label.
How we calculate cost per effective dose
1. Identify the clinically effective daily dose from published RCTs and meta-analyses.
2. Determine how many servings of the product are needed to reach that dose.
3. Divide the product price by the number of servings in the container to get the cost per serving.
4. Multiply the cost per serving by the number of servings needed per day.
Example: Ashwagandha KSM-66
Clinical dose: 600mg/day
Product: 300mg/capsule, 90 capsules, $19.99
Servings needed: 600 / 300 = 2 capsules/day
Cost per serving: $19.99 / 90 = $0.222
Cost per effective dose: $0.222 x 2 = $0.44/day
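The four steps and the ashwagandha example above can be sketched in a few lines of Python. The function and parameter names are illustrative, and rounding the capsule count up to a whole number is an assumption (you cannot take a fraction of a capsule), not a rule stated above.

```python
import math


def cost_per_effective_dose(price: float,
                            units_per_container: int,
                            dose_per_unit_mg: float,
                            clinical_dose_mg: float) -> float:
    """Daily cost of reaching the clinically effective dose."""
    if units_per_container <= 0 or dose_per_unit_mg <= 0:
        raise ValueError("container size and dose per unit must be positive")
    # Step 2: units needed per day (rounded up to whole units, an assumption).
    units_needed = math.ceil(clinical_dose_mg / dose_per_unit_mg)
    # Step 3: cost of one unit.
    cost_per_unit = price / units_per_container
    # Step 4: daily cost at the effective dose.
    return cost_per_unit * units_needed


# Ashwagandha KSM-66 example: 300 mg capsules, 90 per bottle, $19.99.
daily_cost = cost_per_effective_dose(19.99, 90, 300, 600)
print(round(daily_cost, 2))  # 0.44
```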
We score each product against others in its own category, so the products with the lowest cost per effective dose come out on top. A cheap sticker price will not save a product that is so underdosed you could never reasonably reach the effective dose: if it does not deliver enough active ingredient to work, there is no real value to weigh, and it scores at the bottom.
Transparency (0-25 points)
The simplest version of this pillar: can you tell exactly what you are getting just by reading the label?
Labels with full ingredient disclosure, specific forms identified, no proprietary blends, and third-party certification prominently displayed receive the highest scores. Minor gaps, such as an unspecified source alongside a clearly stated form, score slightly lower. Proprietary blends that hide individual ingredient amounts receive the lowest scores.
When a label says “Proprietary Blend: 500mg” across five ingredients, you have no way to know whether the one you came for is 400mg or 5mg, and that gap is the difference between a working dose and a sprinkle. In our view, proprietary blends serve the brand's margins, not you, so we score them down.
Interpreting Execution Scores
| Score | Label | What it means |
|---|---|---|
| 85-100 | Excellent | Top-tier across all four pillars. Clinically dosed, independently tested, cost-effective, and fully transparent. Recommended without reservation. |
| 70-84 | Good | Strong performance with minor gaps in one area. Solid products that most consumers should feel confident purchasing. |
| 55-69 | Fair | Meaningful compromises in one or more areas. Acceptable, but better options usually exist in the same category. |
| 40-54 | Poor | Significant issues across multiple pillars. Generally not recommended when better-scoring alternatives are available. |
| Below 40 | Very Poor | Major problems in dosing, testing, value, or transparency. Not recommended. |
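A minimal sketch of how the four pillar scores add up and map to the labels in the table above. The function names and the sample pillar values are hypothetical. Treating each band's lower bound as inclusive for fractional totals is also an assumption, since the table lists whole-number ranges only.

```python
def execution_score(dosing_form: float, purity: float,
                    value: float, transparency: float) -> float:
    """Sum four equally weighted pillars, each worth 0 to 25 points."""
    pillars = (dosing_form, purity, value, transparency)
    for points in pillars:
        if not 0 <= points <= 25:
            raise ValueError("each pillar must be between 0 and 25")
    return sum(pillars)


def score_label(score: float) -> str:
    """Map a 0-100 execution score to its label."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 55:
        return "Fair"
    if score >= 40:
        return "Poor"
    return "Very Poor"


# Hypothetical product: strong on dosing and transparency, weaker on value.
total = execution_score(22, 20, 18, 24)
print(total, score_label(total))  # 84 Good
```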
Our Data Sources
| Source | What it provides |
|---|---|
| NIH Dietary Supplement Label Database | 207,000+ verified product labels for ingredient verification |
| NIH Office of Dietary Supplements | Evidence summaries per nutrient from the federal government |
| PubMed | Systematic reviews and meta-analyses for evidence tier assignments |
| Examine.com | Research summaries used as starting points, always verified against primary sources |
| USP Verified Product List | Gold standard for third-party quality verification |
| NSF Certified for Sport Database | Quality verification trusted by professional athletes and organizations |
| ConsumerLab.com | Publicly available pass/fail results for product testing |
| BSCG Certified Database | Banned substance testing and quality verification |
| Informed Choice Database | Sports supplement testing and certification |
| Amazon & Brand Websites | Current pricing data, verified regularly and timestamped |
Our Research Workflow
If you want to see the whole thing end to end, here is the same process we run for every supplement we cover:
1. Assign the evidence tier. Review systematic reviews, meta-analyses, and the best available RCTs to determine the strength of clinical evidence for the ingredient's primary claimed benefit.
2. Grade each individual claim A-F so the page separates strong, mixed, weak, and unsupported benefit claims instead of flattening them into one verdict.
3. Identify the clinically effective dose and optimal forms from the research literature.
4. Pull the top 8-15 products from Amazon bestsellers, recommended brands, and reader requests.
5. For each product: record ingredients, doses, and forms from the label (DSLD or brand site). Check third-party testing status against USP, NSF, ConsumerLab, BSCG, and Informed Choice databases. Record current price and servings per container. Calculate cost per clinically effective dose. Assess label transparency.
6. Score each product on the four-pillar execution rubric to produce a score out of 100.
7. Write the evidence summary from primary research, rank products by execution score, and publish the complete scorecard page.