Scoring Methodology
Most supplement review sites are affiliate farms with a scoring rubric bolted on after the fact. We built the rubric first.
Here is exactly how we evaluate every product in our database. The methodology is standardized, reproducible, and independent of any brand relationship. Affiliate commissions never touch the scores.

The Two-Tier System
Traditional supplement scores combine two fundamentally different questions into one grade: does this ingredient work, and did this brand make the product well? That produces misleading results. A well-manufactured product with zero clinical evidence can get the same grade as a poorly labeled product backed by strong research. They are not the same problem.
We separate them into two independent scores.
Evidence Rating
Does this ingredient actually do what it claims? Rated once per ingredient type. Whether you buy Thorne or Nature Made magnesium glycinate, the underlying clinical evidence is identical. We rate the evidence for the molecule, not the brand.
Execution Score /100
Is this specific product well-made? Rated per brand. Two products can contain the same ingredient and differ enormously in dose accuracy, third-party testing, cost per effective dose, and label honesty. The execution score captures these differences.
Tier 1: Evidence Rating
The evidence tier reflects the strength of clinical research supporting an ingredient's primary claimed benefit. We base this on the hierarchy of evidence: systematic reviews and meta-analyses of randomized controlled trials (RCTs) carry the most weight, followed by individual RCTs, then observational studies, then animal and in-vitro research.
| Tier | Criteria |
|---|---|
| Strong Evidence | Multiple large RCTs and systematic reviews or meta-analyses confirming benefit. Consistent results across studies with meaningful effect sizes. The evidence base is strong enough that the medical community broadly recognizes the benefit. |
| Likely Effective | Several RCTs showing benefit with generally consistent results. Good evidence base but may have limitations in study design, sample size, or generalizability. The weight of evidence favors benefit. |
| Mixed Evidence | Limited RCTs, mixed or inconsistent results, or primarily observational studies. Evidence is suggestive but not conclusive. Reasonable people looking at the same data could disagree. |
| Weak Evidence | Mostly animal or in-vitro studies. Human data is scarce, poorly designed, or from very small trials. The mechanism is plausible but clinical evidence in humans is insufficient. |
| Ineffective | No meaningful evidence of benefit in humans, or evidence actively shows the supplement does not work for its primary claimed purpose. Marketing claims are not supported by research. |
Evidence ratings are ingredient-specific, not brand-specific. Creatine monohydrate sits in the top evidence tier (Strong Evidence) regardless of who manufactures it. The rating reflects the research on the molecule.
Ingredients in the lowest evidence tier (Ineffective) do not receive brand scores. If the evidence shows an ingredient does not work for its primary claimed purpose, scoring how well a brand manufactured it is beside the point. We note this on the scorecard page and explain why.
A-F Claim Grades
The tier is the top-level judgment for an ingredient. Claim grades are more granular. On each supplement page, the evidence table grades individual claims such as sleep quality, blood pressure, soreness, or immune support so one over-marketed benefit does not hide the rest of the evidence.
| Grade | Level | What it means |
|---|---|---|
| A | Strong | Consistent human evidence, usually including high-quality RCTs or systematic reviews, supports the claim. |
| B | Moderate | Human studies generally point in the same direction, but limitations remain in size, population, or replication. |
| C | Limited | The claim is plausible, but the evidence is mixed, early, indirect, or too small to treat as settled. |
| D | Weak | Support comes mostly from mechanisms, animal studies, observational data, or very thin human trials. |
| F | None | No meaningful human evidence supports the claim, or the available evidence argues against it. |
Claim grades are not product scores. A supplement can have one well-supported claim and several weak claims. A product can also execute well on dose, purity, value, and transparency even when the ingredient's evidence is only moderate.
Tier 2: Execution Score (0-100)
Every scored product receives an execution score from 0 to 100, calculated from four equally weighted pillars. Each pillar is worth up to 25 points.
Dosing & Form (0-25 points)
The most quantitative pillar. We compare what the product delivers to what the clinical research says you need.
Dose adequacy (up to 17 points)
We calculate the ratio of the product's dose per recommended serving to the minimum clinically effective dose established in published RCTs. Products delivering 100% or more of the clinical dose receive full points. Products that underdose receive proportionally fewer points. A product at 80% of the clinical dose scores less than one at 100%. A product at 40% scores less still. This is arithmetic, not subjective judgment.
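As a rough illustration of the arithmetic described above, here is a minimal Python sketch. It assumes strictly linear scaling between 0% and 100% of the clinical dose, which is one reading of "proportionally fewer points"; the function name `dose_adequacy_points` and the rounding to one decimal place are hypothetical choices for this example, not a statement of the exact formula used on the site.

```python
def dose_adequacy_points(product_dose_mg: float,
                         clinical_dose_mg: float,
                         max_points: float = 17.0) -> float:
    """Award points in proportion to the share of the clinical dose delivered.

    Assumption: linear scaling, capped at 100% of the clinical dose.
    """
    if clinical_dose_mg <= 0:
        raise ValueError("clinical_dose_mg must be positive")
    # Ratio of the product's dose per serving to the minimum effective dose.
    ratio = max(product_dose_mg, 0.0) / clinical_dose_mg
    # Doses above the clinical dose earn no extra points.
    return round(min(ratio, 1.0) * max_points, 1)


print(dose_adequacy_points(600, 600))  # 100% of the clinical dose
print(dose_adequacy_points(480, 600))  # 80% of the clinical dose
print(dose_adequacy_points(240, 600))  # 40% of the clinical dose
```

Under this assumed linear rule, a product at 80% of the clinical dose would earn 13.6 of 17 points and one at 40% would earn 6.8.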
Form quality (up to 8 points)
Not all forms of an ingredient absorb equally. Magnesium glycinate absorbs significantly better than magnesium oxide. Methylcobalamin is more readily utilized than cyanocobalamin for most people. We check whether the product uses the most bioavailable form supported by absorption research. Products using optimal forms receive full points. Products using inferior forms receive fewer.
Purity Verification (0-25 points)
Third-party testing is the only independent verification that a supplement contains what its label claims and is free from contaminants, heavy metals, and undeclared ingredients. We score based on the tier of verification.
| Verification Level | Score Range |
|---|---|
| USP Verified or NSF Certified for Sport | Highest |
| ConsumerLab approved, BSCG certified, or Informed Choice verified | High |
| GMP certified facility, no product-level testing | Moderate |
| No third-party verification | Low |
| Failed third-party testing or known contamination | Near-zero |
The distinction between facility-level GMP certification and product-level third-party testing matters. GMP means the manufacturing process meets baseline standards. Third-party product testing means someone actually checked what is in the specific bottle you are buying.
Value (0-25 points)
We calculate cost per clinically effective daily dose. Not cost per pill. Not cost per serving. This is the only cost metric that matters, and it is the single metric that separates this site from most supplement review sites.
A product might cost $0.10 per pill, but if you need three pills to reach the clinically effective dose, your actual cost is $0.30 per day. Another product costs $0.25 per pill but delivers the full clinical dose in one capsule. The expensive-looking pill is actually cheaper where it counts. Most comparison sites miss this entirely because they compare sticker prices.
How we calculate cost per effective dose
1. Identify the clinically effective daily dose from published RCTs and meta-analyses.
2. Determine how many servings of the product are needed to reach that dose.
3. Divide the product price by the number of servings in the container.
4. Multiply by the number of servings needed per day.
Example: Ashwagandha KSM-66
Clinical dose: 600mg/day
Product: 300mg/capsule, 90 capsules, $19.99
Servings needed: 600 / 300 = 2 capsules/day
Cost per serving: $19.99 / 90 = $0.222
Cost per effective dose: $0.222 x 2 = $0.44/day
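The four steps and the worked example above can be sketched in a few lines of Python. The function and parameter names are hypothetical, and rounding the unit count up to a whole capsule is an assumption made for this illustration (it presumes capsules are not split).

```python
import math


def cost_per_effective_dose(price: float,
                            units_per_container: int,
                            dose_per_unit_mg: float,
                            clinical_dose_mg: float) -> float:
    """Return the daily cost of reaching the clinically effective dose."""
    if units_per_container <= 0 or dose_per_unit_mg <= 0:
        raise ValueError("units_per_container and dose_per_unit_mg must be positive")
    # Step 2: units needed per day (assumption: round up to whole units).
    units_per_day = math.ceil(clinical_dose_mg / dose_per_unit_mg)
    # Step 3: price of a single unit.
    cost_per_unit = price / units_per_container
    # Step 4: daily cost at the effective dose.
    return units_per_day * cost_per_unit


# Ashwagandha KSM-66 example from the text: 300mg capsules, 90 count, $19.99.
daily_cost = cost_per_effective_dose(19.99, 90, 300, 600)
print(f"${daily_cost:.2f}/day")  # $0.44/day
```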
Products are scored relative to their category peers. The most cost-effective products per effective dose score highest. Products so underdosed that reaching the effective dose is impractical score lowest regardless of sticker price, because you cannot calculate meaningful value for a product that does not deliver enough active ingredient to work.
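The text does not state the formula behind peer-relative scoring, so the following Python sketch is only one possible approach, shown to make the idea concrete. It assumes the cheapest product in a category earns the full 25 points, other products earn points in inverse proportion to their daily cost, and products whose effective dose is impractical to reach (represented here as `None`) earn zero. The name `value_points` and all of these rules are hypothetical.

```python
from typing import Dict, Optional


def value_points(cost_per_day: Dict[str, Optional[float]],
                 max_points: float = 25.0) -> Dict[str, float]:
    """Score each product's value against the cheapest peer in its category.

    Assumption: points = max_points * (cheapest cost / product cost).
    A cost of None marks a product too underdosed to score; it gets 0.
    """
    valid = [c for c in cost_per_day.values() if c is not None and c > 0]
    if not valid:
        return {name: 0.0 for name in cost_per_day}
    cheapest = min(valid)
    points = {}
    for name, cost in cost_per_day.items():
        if cost is None or cost <= 0:
            points[name] = 0.0
        else:
            points[name] = round(max_points * cheapest / cost, 1)
    return points


# Hypothetical category with three products (costs in dollars per day).
print(value_points({"Product A": 0.44, "Product B": 0.88, "Product C": None}))
```

In this sketch, Product A would receive 25.0 points, Product B 12.5, and Product C 0.0. Other ranking schemes, such as percentile ranks within the category, would fit the description equally well.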
Transparency (0-25 points)
Does the label tell you exactly what you are getting?
Full ingredient disclosure with specific forms identified, no proprietary blends, and third-party certification prominently displayed: highest scores. Minor gaps such as an unspecified source but clearly stated form: slightly lower. Proprietary blends that hide individual ingredient amounts: lowest scores.
When a label says “Proprietary Blend: 500mg” containing five ingredients, you have no idea whether the active ingredient is 400mg or 5mg. Proprietary blends exist to protect profit margins, not to protect consumers. We score accordingly.
Interpreting Execution Scores
| Score | Label | What it means |
|---|---|---|
| 85-100 | Excellent | Top-tier across all four pillars. Clinically dosed, independently tested, cost-effective, and fully transparent. Recommended without reservation. |
| 70-84 | Good | Strong performance with minor gaps in one area. Solid products that most consumers should feel confident purchasing. |
| 55-69 | Fair | Meaningful compromises in one or more areas. Acceptable, but better options usually exist in the same category. |
| 40-54 | Poor | Significant issues across multiple pillars. Generally not recommended when better-scoring alternatives are available. |
| Below 40 | Very Poor | Major problems in dosing, testing, value, or transparency. Not recommended. |
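To show how the four pillars combine and map onto the labels in the table above, here is a small Python sketch. The function names are hypothetical. The table lists whole-number bands, so treating each band's lower bound as a threshold (for example, 84.5 counts as Good) is an assumption made for this example.

```python
def execution_score(dosing: float, purity: float,
                    value: float, transparency: float) -> float:
    """Sum the four pillars, each of which must fall between 0 and 25."""
    pillars = (dosing, purity, value, transparency)
    for points in pillars:
        if not 0 <= points <= 25:
            raise ValueError("each pillar must be between 0 and 25 points")
    return sum(pillars)


def score_label(score: float) -> str:
    """Map an execution score (0-100) to its label from the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: each band's lower bound acts as the cutoff.
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 55:
        return "Fair"
    if score >= 40:
        return "Poor"
    return "Very Poor"


# Hypothetical pillar scores for a single product.
total = execution_score(dosing=22, purity=25, value=18, transparency=20)
print(total, score_label(total))
```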
Our Data Sources
NIH Dietary Supplement Label Database
207,000+ verified product labels for ingredient verification
NIH Office of Dietary Supplements
Evidence summaries per nutrient from the federal government
PubMed
Systematic reviews and meta-analyses for evidence tier assignments
Examine.com
Research summaries used as starting points, always verified against primary sources
USP Verified Product List
Gold standard for third-party quality verification
NSF Certified for Sport Database
Quality verification trusted by professional athletes and organizations
ConsumerLab.com
Publicly available pass/fail results for product testing
BSCG Certified Database
Banned substance testing and quality verification
Informed Choice Database
Sports supplement testing and certification
Amazon & Brand Websites
Current pricing data, verified regularly and timestamped
Our Research Workflow
For every supplement type we cover, we follow a standardized process:
1. Assign the evidence tier. Review systematic reviews, meta-analyses, and the best available RCTs to determine the strength of clinical evidence for the ingredient's primary claimed benefit.
2. Grade each individual claim A-F so the page separates strong, mixed, weak, and unsupported benefit claims instead of flattening them into one verdict.
3. Identify the clinically effective dose and optimal forms from the research literature.
4. Pull the top 8-15 products from Amazon bestsellers, recommended brands, and reader requests.
5. For each product: record ingredients, doses, and forms from the label (DSLD or brand site). Check third-party testing status against USP, NSF, ConsumerLab, BSCG, and Informed Choice databases. Record current price and servings per container. Calculate cost per clinically effective dose. Assess label transparency.
6. Score each product on the four-pillar execution rubric to produce a score out of 100.
7. Write the evidence summary from primary research, rank products by execution score, and publish the complete scorecard page.
Our Independence Guarantee
Affiliate relationships never influence scores. We earn commissions when you purchase through our links, but every product is scored using the same rubric regardless of whether we have an affiliate relationship with the brand. Many of our highest-scored products are from brands we have no affiliate relationship with at all.
If we discover an error in our scoring, we update the page and note the correction. If a supplement does not work, we say so. If a popular product is overpriced or underdosed, we say that too. The methodology exists to protect readers, not brands.