Design
The study was performed in two parts. In the first part, 18 patients undergoing upper abdominal ultrasound as outpatients for various indications were randomly selected; the scratch test was performed by two raters independently and followed by the ultrasound.
In the second part of the study, the two raters independently performed the scratch test on separate randomly selected outpatients (15 patients by rater 1, and 16 patients by rater 2), followed by upper abdominal ultrasound.
Scratch test procedure
The scratch test was performed by marking a point on the right costal margin at the midclavicular line (point A in Figure 1). This point was used as the reference for measuring liver span below the costal margin. The diaphragm of the stethoscope was placed on the xiphisternum (point C in Figure 1). Light transverse strokes of the skin with a single finger, parallel to the suspected liver edge, were made advancing from the right lower quadrant along the midclavicular line to the costal margin. When the hepatic edge was reached (point B1 in Figure 1), the scratching sound was transmitted through the solid liver, producing a sudden increase in auscultated sound intensity; the sound intensity continued to increase until it was maximal (point B2), and this point was taken as the best estimate of the liver edge. The distance between this point and point A (distance AB2) was recorded on a data sheet. The sonographers used the same reference point (point A) to measure the distance to the liver edge but were blinded to the value obtained by the clinical raters. The results were recorded in centimeters between the right costal margin (RCM) and the liver edge.
The two raters (AD, KG) were senior medical registrars in the department of general medicine. Before the study, two calibration sessions with a consultant (JA) were performed to standardise the method, i.e. where to place the stethoscope, where to scratch, whether to listen for the point of the start of sound transmission or the point of maximal sound transmission, etc. The two raters also used the same brand of stethoscope. To ensure that the sound transmission was not purely through the skin, we measured a control point below the xiphisternum using the same stroking technique (point D in Figure 1). This was the point of maximal sound intensity when the skin was stroked with the finger ascending towards the stethoscope from below the umbilicus. The distance CD was measured in centimeters and compared with the distance B2C. If B2C was greater than CD, it was assumed that the sound heard at point B2 was transmitted through the liver, and AB2 was then measured as the liver span below the RCM. However, if B2C was less than or equal to CD, we assumed that the sound was likely due to skin conduction and that the liver edge did not extend beyond the RCM. No clinical information about the subjects was provided to the raters, and no other methods of physical examination were performed in the study, in order not to bias the interpretation of the scratch test; in particular, palpation of the liver edge was not performed.
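The control-point comparison described above can be written as a simple decision rule. The following Python sketch is illustrative only: the function name and parameter names are hypothetical, and recording the span as 0 cm when the liver edge is assumed not to extend beyond the RCM is an assumption about how such a result might be coded, not something stated in the protocol.

```python
def liver_span_below_rcm(ab2_cm: float, b2c_cm: float, cd_cm: float) -> float:
    """Return the liver span below the right costal margin (RCM), in cm.

    ab2_cm: distance from reference point A to point B2 (maximal sound)
    b2c_cm: distance from point B2 to the stethoscope at point C
    cd_cm:  distance from control point D to the stethoscope at point C
    """
    if b2c_cm > cd_cm:
        # Sound at B2 was heard farther from the stethoscope than skin
        # conduction alone (control distance CD) would explain, so it is
        # attributed to transmission through the liver.
        return ab2_cm
    # B2C <= CD: attributed to skin conduction; the liver edge is assumed
    # not to extend beyond the RCM (coded here as 0 cm, an assumption).
    return 0.0
```

For example, with hypothetical measurements AB2 = 4 cm, B2C = 12 cm and CD = 9 cm, the rule returns a span of 4 cm; if B2C were 9 cm or less, it would return 0 cm.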
Informed, written consent was obtained from all patients and the study was approved by the Hunter New England Area Ethics committee.
Ultrasound procedure
The ultrasound was performed with a Philips iU22 Ultrasound Machine (Koninklijke Philips Electronics N.V., Netherlands) using a 5-2 MHz curved array transducer with the default abdominal preset. Time Gain Compensation (TGC) curves were adjusted to optimize the image quality if required. A single focal zone was set to the mid liver parenchyma. Harmonics were off. Patients were asked to hold their breath during the ultrasound exam but not during the scratch test, in order to mimic usual clinical practice.
Statistical analysis
The co-primary outcomes were agreement between the two observers, measured using the intra-class correlation coefficient (ICC), and agreement between each observer and the ultrasonography (USG) reference standard, measured using Spearman’s correlation coefficient (rho), which is the non-parametric equivalent of Pearson’s coefficient.
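To illustrate why Spearman’s rho is described as the non-parametric equivalent of Pearson’s coefficient, the sketch below computes rho as Pearson’s correlation applied to the ranks of the data, with tied values given their average rank. It is a minimal standard-library example, not the software used in the study; the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering of the measurements enters the calculation, any strictly increasing relationship between rater and USG values gives rho = 1, whether or not it is linear.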
Secondary outcomes included the degree and source of disagreement, including:
- Bland-Altman plots of the difference between each rater and the USG (on the y-axis) against the USG value (on the x-axis).
- the proportion of rater values that lie within 1, 2 or 3 cm of the reference value.
- whether the absolute value of the distance or the subject’s body mass index (BMI) influenced the degree of error between the clinical observer and the USG, assessed using linear regression.
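The first two secondary outcomes can be sketched in a few lines of Python. This is an illustrative example under stated assumptions: the function names are hypothetical, the limits of agreement use the conventional mean difference ± 1.96 standard deviations (not specified in the text), and "within" a threshold is treated as inclusive.

```python
from statistics import mean, stdev


def bland_altman_summary(rater_cm, usg_cm):
    """Differences (rater minus USG), mean bias, and limits of agreement."""
    diffs = [r - u for r, u in zip(rater_cm, usg_cm)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "differences": diffs,          # y-axis values of the plot
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,  # conventional limits (assumption)
        "upper_loa": bias + 1.96 * sd,
    }


def proportion_within(rater_cm, usg_cm, threshold_cm):
    """Fraction of rater values within threshold_cm of the USG value."""
    errors = [abs(r - u) for r, u in zip(rater_cm, usg_cm)]
    return sum(e <= threshold_cm for e in errors) / len(errors)


# Hypothetical measurements in cm, for illustration only (not study data).
rater = [2.0, 5.0, 0.0, 7.0]
usg = [3.0, 4.0, 2.5, 7.0]
summary = bland_altman_summary(rater, usg)
within = {t: proportion_within(rater, usg, t) for t in (1, 2, 3)}
```

Plotting `summary["differences"]` against the USG values, with horizontal lines at the bias and the two limits, gives the layout described in the first list item.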
A p-value of <0.05 was considered statistically significant.