Note: this is part 2 of the Science Integrity series
At the AAIC conference in July 2021, Cassava Sciences presented a poster showcasing encouraging results from SavaDx, their novel plasma diagnostic/biomarker to detect Alzheimer’s disease (AD). SavaDx measures changes in levels of a protein named altered filamin A (FLNA), which is linked to the AD disease mechanism. The scaffolding of FLNA is also critically related to the amyloid and tau pathologies in AD. In a Phase 2 trial spanning 28 days, FLNA levels were significantly reduced in the two Simufilam treatment groups when compared to the placebo group. The poster also validated the SavaDx results by showing that the treated patients in the same clinical study exhibited significant reductions in plasma P-tau181, a key biomarker known to correlate highly with cognitive measures and disease progression.
Since the conference presentation, the SavaDx poster has come under heavy scrutiny for displaying two mutually incongruent plasma P-tau181 data charts, namely the spaghetti plot (a bipartite graph) and the scatter plot. Some absent data points and a misplaced outlier were highlighted (amongst other complaints) in a citizen petition to the FDA asking for the Phase 3 trials of Simufilam to be halted. A supplement to the petition disputes the p-value claims in the poster with alternate, speculative scenarios of the outlier placement. The poster has also been a subject of hot debate on social media, with Dr. Elisabeth Bik leading the criticism in the Twitterverse as well as on her blog. On Sept 3, Cassava Sciences released a public statement with corrected plasma P-tau181 charts.
In the rest of this article, we take a deep dive into the extracted raw data and address misconceptions perpetuated on social media platforms (Twitter, Reddit, etc.), relying on independent analysis from two data scientists. In summary, the underlying plasma P-tau181 data, incorporating the minor corrections, indicates no deliberate manipulation and, in fact, reinforces the optimism about good cognitive endpoint outcomes expected from the upcoming Phase 3 trials.
Validity of Our Extracted Data
Two contributing data scientists of this blog independently extracted the raw plasma P-tau181 values from the spaghetti and scatter plots (Figures 5 and 4 respectively in the SavaDx poster). The subsequent corrections made in Cassava Sciences’ public statement, i.e., the insertion of the two missing lines in the spaghetti plot and the removal of the outlier from all analysis, were incorporated. Note that the corrected 100mg spaghetti plot has 17 data points after the outlier is removed.
Data scientist 1 (DS1) digitally placed fine-grained horizontal scales on the image to obtain the relative pixel heights of the data points. Data scientist 2 (DS2) used a web tool to precisely annotate the dots/lines in the poster and obtain their 2D pixel positions.
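Both extraction approaches ultimately turn a pixel position into an axis value. As a rough illustration (not the data scientists' actual scripts), this can be sketched as a linear calibration against two axis ticks. The function name `pixel_to_value` and the tick readings are hypothetical, and a linear (non-logarithmic) y-axis is assumed.

```python
def pixel_to_value(pixel_y, ref_pixels, ref_values):
    """Linearly map a pixel row to a data value using two axis ticks.

    ref_pixels: pixel rows of two known axis ticks (hypothetical readings)
    ref_values: the axis values printed next to those ticks
    Assumes a linear y-axis; a log axis would need a log transform first.
    """
    (p0, p1), (v0, v1) = ref_pixels, ref_values
    if p0 == p1:
        raise ValueError("reference ticks must sit on different pixel rows")
    # Image rows grow downward, so the slope is usually negative.
    slope = (v1 - v0) / (p1 - p0)
    return v0 + slope * (pixel_y - p0)


# Illustrative numbers only: tick "0" at row 400, tick "100" at row 200.
print(pixel_to_value(300, (400, 200), (0.0, 100.0)))
```

Any error in reading the tick rows shifts every extracted value in the same way, which is one reason small offsets between two independent extractions are expected.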
From figures 1 and 2, it should be immediately clear that the two data scientists’ independently extracted values closely track each other for the 100mg group. The same holds for the placebo and 50mg groups. In the Appendix, we publish all our extracted raw values for placebo, 50mg, and 100mg for transparency, along with the raw data Dr. Bik shared as a screenshot in a tweet. Overall, this establishes high confidence that the raw data extracted by our data scientists represents the ground truth, a core assumption we make. For the rest of our analysis, we use the average of the two raw values for each data point.
Oddly enough, Dr. Bik’s raw data (marked EB in figures 1 and 2) has only 15 values for 100mg (which does not match the corresponding scatter plot either) and deviates more from the ground truth. The divergence is understandable, as the online extraction tool she used may not have spotted all the overlapping lines in the bottom half of the spaghetti plot. However, it does raise some questions about her due diligence in applying the proverbial fine-toothed comb here, a quality she is admired for in the scientific critiques she shares publicly on Twitter.
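The averaging step described above, together with a simple agreement check, might look like the following minimal sketch. The helper names and the sample numbers are hypothetical, and it assumes both extractions list the same data points in the same order.

```python
def average_extractions(ds1, ds2):
    """Average two independent extractions point by point."""
    if len(ds1) != len(ds2):
        raise ValueError("both extractions must cover the same data points")
    return [(a + b) / 2 for a, b in zip(ds1, ds2)]


def max_abs_gap(ds1, ds2):
    """Largest disagreement between the two extractions, in axis units."""
    return max(abs(a - b) for a, b in zip(ds1, ds2))


# Illustrative values only, not the extracted poster data.
ds1 = [10.0, 20.0]
ds2 = [12.0, 19.0]
print(average_extractions(ds1, ds2), max_abs_gap(ds1, ds2))
```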
Unfounded Manipulation Concerns
In figures 3, 4, and 5 below, we co-plot the sorted % improvements (aka CFB, Change from Baseline) obtained from the two distinct chart sources in the SavaDx poster. For the scatter plot, we derive them from the pairwise differences between the raw Day 1 and Day 28 plasma P-tau values. The spaghetti chart contains the CFB values directly.
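For the scatter plot, the derivation can be sketched as below. This assumes CFB is the percent change relative to the Day 1 value, (Day 28 − Day 1) / Day 1 × 100, so that a negative number means a reduction, and that the Day 1 and Day 28 lists are paired by patient. The function name and the sample values are hypothetical.

```python
def percent_change_from_baseline(day1, day28):
    """Per-patient % change from the Day 1 baseline (negative = reduction)."""
    if len(day1) != len(day28):
        raise ValueError("Day 1 and Day 28 values must be paired per patient")
    return [100.0 * (d28 - d1) / d1 for d1, d28 in zip(day1, day28)]


# Illustrative values only, not the extracted poster data.
cfb = percent_change_from_baseline([20.0, 40.0], [15.0, 50.0])
print(sorted(cfb))  # [-25.0, 25.0]
```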
In each of the figures, notice how the lines of sorted CFB values coincide well visually. Ideally, the two lines would coincide exactly, since both charts depict the same underlying data. However, they are expected to diverge slightly due to the inherent inaccuracies of our raw extraction process. The visual coincidence is supported by a low RMSE (root mean square error) and a high correlation coefficient between the two lines in each of the graphs. Additionally, a paired t-test was run with the null hypothesis that the two sets of values come from the same distribution. The high p-values (summarized in table 1) fail to reject the null hypothesis, though this does not prove that the distributions are identical.
| Table 1: Paired t-test results | p-value | t-value | Degrees of freedom (N-1) |
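The three agreement measures mentioned above (RMSE, correlation coefficient, and a paired t statistic with N-1 degrees of freedom, as named in the results table) can be computed along these lines. This is a standard-library sketch with hypothetical function names and made-up sample values, not the analysis code behind the figures.

```python
import math


def rmse(a, b):
    """Root mean square error between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pearson_r(a, b):
    """Pearson correlation coefficient between two series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def paired_t(a, b):
    """Paired t statistic and its degrees of freedom (N - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1


# Illustrative sorted CFB values only, not the extracted poster data.
spaghetti = sorted([-30.0, -10.0, 5.0, 20.0])
scatter = sorted([-29.0, -11.0, 6.0, 19.0])
print(rmse(spaghetti, scatter), pearson_r(spaghetti, scatter))
print(paired_t(spaghetti, scatter))
```

A two-sided p-value would then come from the t distribution with the returned degrees of freedom, for example via `scipy.stats.t.sf(abs(t), df) * 2` if SciPy is available.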
We assert that this observation eliminates the doubts raised about whether the Cassava team modified the underlying data for either the scatter or the spaghetti plot. Such tampering would be hard to carry out unnoticed, as making each plot suit a certain narrative while also preserving the above-mentioned invariance is extremely difficult.
Evidence of Significant Treatment Effects
Table 2 highlights the statistical significance of the differences in mean CFB between 50mg vs placebo and 100mg vs placebo in the corrected data, where the single outlier is removed from all analysis. Statistical significance was measured by an unpaired t-test. Note that any p-value below the conventional 0.05 threshold is considered statistically significant. Though the mean CFB of 100mg (-16%) isn’t very different from that of 50mg (-15%), when compared to placebo (mean CFB +14%), the 100mg treatment effect carries much higher confidence (2.6x lower p-value) than the 50mg treatment effect, though both are significant.
| Table 2: Unpaired t-test results relative to Placebo | p-value | t-value | Degrees of freedom (N1 + N2 – 2) |
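Degrees of freedom of N1 + N2 – 2 correspond to the classic pooled-variance (Student's) unpaired t-test. A minimal standard-library sketch of that statistic follows. It assumes equal variances in the two groups, and the function name and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's unpaired t statistic with pooled variance.

    Returns (t, degrees_of_freedom), where df = N1 + N2 - 2.
    Assumes both groups share the same underlying variance.
    """
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Illustrative % CFB values only, not the extracted poster data.
treated = [-20.0, -10.0, -15.0]
placebo = [10.0, 20.0, 15.0]
print(pooled_t(treated, placebo))
```

As with the paired case, the p-value follows from the t distribution with the returned degrees of freedom. A negative t here simply means the first group's mean CFB is lower than the second's.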
It is interesting to observe in Figure 6 below that even though placebo’s mean CFB (i.e., CFB computed for each patient across the 28 days and then averaged) is +14%, the means of the absolute plasma P-tau values at Day 1 and Day 28 (across patients) are similar to each other. We interpret this as indicating that the placebo group did not really shift from its baseline plasma P-tau values. On the other hand, the means of the absolute plasma P-tau values for the 50mg and 100mg groups drop clearly between Day 1 and Day 28, indicating real treatment effects. Figure 7 shows a similar pattern for the standard deviation, which further indicates positive treatment effects, as the 50mg and 100mg groups exhibit faster convergence/regression to lower plasma P-tau values. We also include histograms of the Day 1 and Day 28 plasma P-tau values in figures 8 and 9.
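A positive mean CFB can coexist with unchanged group means because the average of per-patient percentages weights every patient equally, whereas the group mean of absolute values is dominated by patients with high baselines. The toy example below uses made-up numbers (not the extracted data) and hypothetical function names to show the effect.

```python
from statistics import mean


def mean_percent_cfb(day1, day28):
    """Average of the per-patient % changes from baseline."""
    return mean([100.0 * (b - a) / a for a, b in zip(day1, day28)])


def percent_change_of_means(day1, day28):
    """% change of the group mean between Day 1 and Day 28."""
    return 100.0 * (mean(day28) - mean(day1)) / mean(day1)


# Made-up values: a low-baseline patient rises, a high-baseline patient falls.
day1 = [10.0, 40.0]
day28 = [15.0, 35.0]
print(mean_percent_cfb(day1, day28))         # 18.75 (mean of +50% and -12.5%)
print(percent_change_of_means(day1, day28))  # 0.0 (group mean stays at 25)
```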
Missing Data Points
There has been confusion regarding some missing data points in the SavaDx poster’s plasma P-tau charts. The trial included 64 patients, but only 52 plasma P-tau points were plotted across the groups, leaving 12 absent. The following excerpt from the SavaDx poster, which explains the exclusion criteria, seems to have eluded the critics who insinuated that only favourable data points were cherry-picked for presentation.
Plasma P-tau181 was measured in duplicate by SIMOA®, a digital ELISA platform. Data with CVs >11% were repeated and excluded if >15% on repeat.
Note that Cassava Sciences acknowledged the error with respect to a couple of missing data points and added them to the Placebo group in the corrected spaghetti plot.
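One way to read the exclusion rule quoted from the poster is sketched below. It assumes the CV is the sample standard deviation of the duplicate readings divided by their mean, and that a sample is dropped only if its repeat run still exceeds 15%. The poster does not spell out these details, so both the formula and the function names are assumptions.

```python
from statistics import mean, stdev


def cv_percent(replicates):
    """Coefficient of variation of replicate readings, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)


def keep_sample(first_run, repeat_run=None, repeat_above=11.0, exclude_above=15.0):
    """Return True if the sample is kept, False if it is excluded."""
    if cv_percent(first_run) <= repeat_above:
        return True  # duplicates agree well enough on the first run
    if repeat_run is None:
        raise ValueError("CV above the repeat threshold: a repeat run is required")
    # The sample survives only if the repeat run is within the exclusion limit.
    return cv_percent(repeat_run) <= exclude_above


# Made-up duplicate readings, for illustration only.
print(keep_sample([10.0, 10.5]))                # True, first-run CV is about 3.4%
print(keep_sample([10.0, 13.0], [10.0, 13.0]))  # False, repeat CV is about 18.4%
print(keep_sample([10.0, 13.0], [10.0, 11.0]))  # True, repeat CV is about 6.7%
```

Under this reading, exclusion depends only on how well a sample's duplicate readings agree, not on whether its value is favourable.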
We conclude that the controversy surrounding the plasma P-tau181 data is much ado about nothing. There are no indications of deliberate manipulation. We thank all the critics for prompting independent efforts from our data scientists to not only validate the data but also bolster the argument that we can expect positive cognitive endpoint outcomes from the upcoming Phase 3 trials.
Averaged plasma P-tau raw values extracted by our data scientists
| Data point index | placebo D1 | placebo D28 | 50mg D1 | 50mg D28 | 100mg D1 | 100mg D28 |
Screenshots of Dr. Bik’s tweet about her raw data values and the embedded image.
Screenshot of the digitally placed horizontal lines used by Data Scientist 1 to extract raw data.
Screenshot of the web tool (https://apps.automeris.io/wpd/) used by Data Scientist 2 to extract raw data.