Listen to the Clinical Chemistry Podcast
C. Stephan, K. Siemssen, H. Cammann, F. Friedersdorff, S. Deger, M. Schrader, K. Miller, M. Lein, K. Jung, H.A. Meyer. Between-Method Differences in Prostate-Specific Antigen Assays Affect Prostate Cancer Risk Prediction by Nomograms. Clin Chem 2011; 57: 995–1004.
Dr. Carsten Stephan is a Urologist at the University Hospital Charité, in Berlin, Germany.
This is a podcast from Clinical Chemistry. I am Bob Barrett.
Detecting prostate cancer relies on the measurement of prostate-specific antigen or PSA concentration. An increased PSA value is associated with a higher probability of having prostate cancer. However, benign prostate hyperplasia or prostatitis can also cause increases in serum PSA. Of all the molecular variables involving the total PSA concentration, only the free PSA percentage is clinically relevant and capable of avoiding unnecessary biopsies. Yet the low specificity of PSA and free PSA percentage remains problematic.
Multivariate models such as artificial neural networks or logistic regression-based nomograms improve prostate cancer risk prediction by combining total PSA, free PSA percentage, and several other factors. Including the percent-free PSA in nomograms has improved the accuracy of prostate cancer diagnoses and the nomograms show an improvement in specificity.
But these prediction models were developed with problematic data, and no one has analyzed whether the use of different PSA and free PSA assays has an effect on nomogram-based prostate cancer prediction. In a paper published in the July issue of Clinical Chemistry, Dr. Carsten Stephan from the Department of Urology at the University Hospital Charité, in Berlin, Germany, and his team evaluated the effect of assay-dependent variation in PSA and free PSA percentage values on nomogram-based prostate cancer prediction. Dr. Stephan is our guest in this podcast.
Doctor, there are several papers published on this subject, dating back to 2006, that cite data from just one patient group on which five different PSA assays were performed simultaneously. Why is that?
Yes, it might look as if it's the same, but every publication we did on this group had a different question. In the first publication, we looked at the data on nearly 600 patients with a PSA range of 0-10 only, and we checked what the differences were in PSA and percent-free PSA; this was also published in 2006 in ‘Clinical Chemistry’. Then in the second paper, in 2007, we looked at the clinical consequences, so what were the cutoffs or the sensitivity and specificity, but then with all 800 patients.
And in the third, a year later, we took these data and used them in an artificial neural network, which considers these PSA and percent-free PSA differences, to look at who has to undergo a biopsy. And the current paper applied these data to five different available nomograms and looked at what the clinical consequences were within these nomograms.
Could all these differences be due to the population and not to the nomograms used? How is this an analytical problem?
Yeah, so we have to say, you have to look at the pre-analytical, the analytical, and the population. On the pre-analytical side, I can only tell you that all samples were run under the same conditions, so everything was run in parallel in the shortest possible time so that we can really exclude pre-analytical problems.
Samples were thawed on the day of the measurements; two assays were run on one day and three assays were run on the next day. Regarding the analytical problems, there was a very low imprecision, below 8%, and of course all analyses were done by a single experienced person.
And regarding the next problem, the population, I can only say, well, we had a referred population and not a screening population. But I think that the makeup of this population, even with almost 60% prostate cancer patients, would not change the differences we obtained between the PSA assays, because these differences are due only to PSA and free PSA and the nomograms.
Looking at the models used in the study, some seem to be missing. Why were artificial neural networks not included?
Yeah, our selection criterion was only the availability of nomograms. So all nomograms which could be used online were chosen, and we did not include artificial neural networks since there are only very few available online. There is, for instance, one ANN for repeat biopsy, which was published in 2009 from the University of Cambridge by Rochester et al. They built this ANN with 86 patients, of whom only 27 were cancer patients, and they validated their ANN with only 23 patients and only 7 cancer patients.
So I cannot take a network like this. There are also other good publications from Babaian, Djavan and Remzi. They are nice publications, but these models are all not available online, and therefore we only chose available nomograms.
Your study reveals a large difference in prostate cancer probabilities, is this typical, and what do you subsequently recommend when using a nomogram?
Yeah, these differences are really typical, given the known assay differences we've got, and so are the differences in prostate cancer probabilities. Of course we know that, for instance, the Abbott assay usually has the lowest PSA, and the Immulite assay has the highest PSA. There are also differences with percent-free PSA; there it is sometimes the other way around.
So one cannot recommend a specific assay to prefer, since sometimes the assay data were the same but the nomogram performed worse than with another assay. That's another problem.
So we recommend that the developers should simply provide the name of the assay, and the clinician must definitely consider this. My wish is to have more transparency from clinical chemists to clinicians, since the reality outside is sometimes very strange. I have an example where I just called the lab to ask if they still use the same test, and they said, well, just today we changed our test from this to this. And then I called the urologist who was sending the data from a patient, and he said, well, I didn't know that, and thanks for the information. So I would like to say here, it's definitely necessary to have transparency from the laboratories to the clinicians.
Well, at first glance, data between assays within one nomogram are quite similar, so why should we look at calibration where the differences appear large? Is it really necessary to use this somewhat complicated calibration?
Yeah, I mean it's complicated, but it's true that the ROC analysis alone provides, at first glance of course, similar results, and this is exactly the problem. The differences will be overlooked with an insufficient comparison, like the median comparison or the ROC analysis without any cutoffs. That's a big problem.
For instance, if you look in the publication, in the supplement, Table 1, it's visible that at 95% sensitivity the specificities for the five assays do not differ that much. For instance, for nomogram 4, the specificities range from 37 to 43%; that's not a big difference. But the prostate cancer probability at this point already differs between 29% and 50%.
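The comparison described here, reading off the specificity at a fixed 95% sensitivity, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `specificity_at_sensitivity` and the example scores are hypothetical and are not taken from the study, and the study's own software and exact procedure are not described in this podcast.

```python
import math

def specificity_at_sensitivity(cancer_scores, benign_scores, sensitivity=0.95):
    """Return (cutoff, specificity) at the highest cutoff that still reaches
    the requested sensitivity. A patient counts as test-positive when the
    predicted probability is >= cutoff. Hypothetical helper for illustration."""
    ranked = sorted(cancer_scores, reverse=True)
    # Number of cancer patients that must lie at or above the cutoff.
    # The small epsilon guards against floating-point noise in the product.
    needed = max(1, math.ceil(sensitivity * len(ranked) - 1e-9))
    cutoff = ranked[needed - 1]
    true_negatives = sum(1 for s in benign_scores if s < cutoff)
    return cutoff, true_negatives / len(benign_scores)

# Made-up nomogram outputs for one assay (not study data).
cancer = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]
benign = [0.05, 0.15, 0.25, 0.45, 0.55]
cutoff, spec = specificity_at_sensitivity(cancer, benign, sensitivity=0.9)
print(f"probability cutoff: {cutoff}, specificity: {spec:.0%}")
```

Running the same function on the probabilities produced with each assay returns both numbers side by side, which is the point made above: the specificities can look similar while the probability cutoff at which they are reached differs considerably.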
So I would like to say that in the last years, ROC analysis has been used with cutoffs, which is clinically important. And then especially the calibration, where we look at the difference between observed and predicted prostate cancer cases, which is calculated within groups of 20 patients.
And especially this calibration has become more and more important, and it's almost a standard when comparing markers or models with an existing standard, for instance with PSA, free PSA, or another nomogram. You should see Figure 2A. The difference between those models without percent-free PSA is low, but the overall performance shows a two-fold higher observed than predicted prostate cancer rate. So we overestimate prostate cancer.
On the other hand, if you look at nomogram 5 in Figure 2E, there are substantial differences between the PSA assays, but the intraclass correlation coefficient (ICC), which is a measure of how close the curve is to the 45-degree line, is always between 0.9 and 1. So this indicates an excellent relationship between observed and predicted prostate cancer cases. And it is important that you look at this and not only check the ROC value and the median values.
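The calibration idea discussed in this answer, comparing observed and predicted prostate cancer cases within groups of 20 patients, can be sketched as follows. This is a minimal illustration and not the study's procedure: sorting patients by predicted probability before grouping is an assumption, the function name `calibration_groups` and the example data are hypothetical, and the ICC mentioned above is not computed here.

```python
def calibration_groups(predicted, outcomes, group_size=20):
    """Sort patients by predicted probability, cut them into consecutive
    groups, and return (mean predicted probability, observed cancer rate)
    for each group. Outcomes are 1 for cancer and 0 for no cancer."""
    pairs = sorted(zip(predicted, outcomes))
    groups = []
    for start in range(0, len(pairs), group_size):
        chunk = pairs[start:start + group_size]
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        groups.append((mean_predicted, observed_rate))
    return groups

# Made-up example with groups of 4 instead of 20 to keep it short.
predicted = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5]
outcomes = [0, 0, 0, 1, 1, 1, 0, 1]
for mean_predicted, observed_rate in calibration_groups(predicted, outcomes, group_size=4):
    print(f"predicted {mean_predicted:.2f} vs observed {observed_rate:.2f}")
```

Plotting the observed rate against the mean predicted probability for each group gives a calibration curve; a well-calibrated model lies close to the 45-degree line, and points above that line mean more cancers were observed than the model predicted.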
Well after seeing your divergent results between the PSA assays and prostate cancer probabilities, what is the ideal way to use nomograms or similar models in the future?
Yeah, I think it's impossible right now to provide one solution for everybody. But with all our clinical data, we simply want to make all nomogram users, and also the clinical chemists who are responsible for PSA and free PSA values, aware that despite harmonization of the PSA assays (we all want to have the WHO standard), there are still large PSA and also percent-free PSA differences.
And these differences are most likely the cause of all further differences, like prostate cancer risks, model calibration differences, et cetera. With our ANN program for five assays, which has been online since 2008, we already provide a helpful hint to improve prostate cancer diagnosis before prostate biopsy.
And I believe that it's generally better to use models for prostate cancer diagnosis which incorporate, for instance, PSA, percent-free PSA, prostate volume, age, and other data like proPSA or urinary PCA3, instead of PSA or percent-free PSA alone.
Taking the PSA assay used into consideration is so far only possible with our ANN program, ProstataClass. Here the user can choose between five different assays. But hopefully, I think, in the future nomogram models will also consider PSA differences. This would be my biggest wish so far.
Dr. Carsten Stephan is a Urologist at the University Hospital Charité, in Berlin, Germany and has been our guest in this podcast from Clinical Chemistry. I am Bob Barrett, thanks for listening!