(Posted by David AuBuchon)
Though not part of our core objectives, EMR has one proposed cosmetic trial. Aside from a desire to provide cheap and effective solutions for consumers, we will likely pursue this trial before all others for a couple of reasons. One is that sale of products used in a hopefully positive study could raise funds for other research. Another reason is that the interventions are relatively safe and simple in comparison to other more complex interventions that we will be getting into in other contexts. This makes it a good first option for allowing a new grassroots organization to prove its competency in conducting clinical trials.
Design of credible cosmetic trials is notoriously challenging. Problems frequently include small sample size, semi-quantitative outcomes with either low or unknown inter and intra-rater reliabilities, unknown correlation between semi-quantitative outcomes and other methods of objective measurements, nonstandardized methods of taking photos, difficulty in controlling for concurrent product use, no control over length of photographed hair, and infeasibility of subject/investigator/rater blinding. New imaging and measurement technologies are making some of these problems easier to address. We wish to conduct a credible cosmetic trial that leverages objective measurement technologies, and that includes a number of creative design elements that to the best of our knowledge have never been considered.
All constructive criticism is invited. Please put on your thinking caps. There is no timeframe at present for initiating this trial. We have to manufacture and document the needed products before anything else.
If you need access to full texts of studies I cite in this proposal, just enter the DOI number into Sci-Hub.1,2 Here is also a useful background paper on designing cosmetic clinical trials.3 If you wish to follow discussion of this proposed trial, subscribe to the comment thread at the bottom of the page.
- That a combination of four at-home interventions can reverse hair loss, increase hair thickness and prominence, increase eyelash length and prominence, increase nail thickness, and reduce the apparent age of the face.
- That a novel controlled “real-world” rating system can provide reliable assessment of multiple cosmetic outcomes, with high inter and intra-rater reliabilities and high correlations between multiple outcome measurement approaches, without the need of a control group.
- That a new “Hair Prominence Assessment” scale will meet validation criteria.
There are 4 proposed simultaneous interventions:
- Microneedling (dermarolling/dermastamping): Use of microneedling on the scalp has been studied to treat hair loss and many consumers report excellent results, particularly when combined with a hair growth serum. However, trials have been too short to offer adequate chance of achieving maximum benefit or complete remission. In fact, no cap on the potential benefit of microneedling has ever been documented. Also, consumer feedback suggests that more frequent at-home use with smaller needles (potentially as often as every day) may be more effective than infrequent treatments with very large needles, typically done by a professional. We wish to test this. This would have the advantages of avoiding expensive professional visits and further minimizing the minor bleeding that occurs with longer needles. Men will use dermarolling. Women will use dermastamping on account of dermarollers catching on long hair and causing it to tangle and break.
- A hair loss serum: We are formulating an aqueous topical hair loss serum containing a variety of nutrients and ingredients that may play roles in treating hair loss. Many promising ingredients are plant-derived oils, which unfortunately can stain pillow cases, leave undesirable residues in hair, or cause the scalp to smell unpleasant. These may adversely affect compliance and hence we will try to avoid having an oil-phase in the product if possible. The serum will be applied before microneedling (on days when both are done) to increase penetration.
- A facial serum: Tentative ingredients are MSM, 20 proteinogenic amino acids, hydrolyzed elastin, soluble collagen, vitamin C, vitamin E, ferulic acid, allantoin, silver citrate, and Olivem 1000. All of these have a rationale for improving skin.
- An oral supplement containing MSM (3 grams) and 20 proteinogenic amino acids (totaling 3 grams): Orally, collagen supplements and a number of individual amino acids have been shown to likely reduce wrinkles and improve the appearance of skin. Similarly, MSM, cysteine, and gelatin supplements have all been shown to likely treat hair loss. Collectively, this suggests that optimal delivery of all proteinogenic amino acids will likely have marked benefit for both hair and skin. Hair is virtually 100% amino acids by mass, with cysteine comprising only about 15% of it. Research has shown that when 8 essential amino acids are supplemented on an empty stomach in precise ratios, about 99% of them follow anabolic pathways. This has been a long-overlooked discovery which we wish to use to help guide our formulation to optimize synthesis of protein in hair and skin.
The amino acid/MSM supplement is a powder taken in water once a day on an empty stomach. The hair serum is a liquid applied to the scalp (like minoxidil) once a day. The facial serum is a relatively light lotion applied to the face once a day. The dermaroller/dermastamp is used on all areas of thinning or receding hair at an initial frequency of once per week. Frequency is increased or decreased according to the desire of the subject, as comfort and convenience allow, up to an ideal of once per day. Initial needle length is 0.5mm. As the scalp may become increasingly resilient, needle length may be increased. Men can work up to as high as 1.0mm needles, as desired. Women can work up to as high as 1.5mm needles, as desired. Women may require longer needles as their hair may provide additional padding on the scalp that needs to be penetrated.
The amino acids/MSM supplement, hair serum, and the facial serum will all be ramped up to the full dose over the course of the first week of the trial. After remaining at the full dose of these 3 products for 1 week, microneedling will be introduced.
Subjects will be advised to discard and replace their dermarollers at 1-month intervals and their dermastamps at 2-month intervals, if they have not already. These are probably conservative replacement times.
The formulations, rationales, and uses of these products will be discussed in greater detail some other time. The focus of this page is on trial design.
Blinded Rater Design
Tentative trial duration is one year. We will enroll 50 active treatment subjects, evenly split between men and women. There will be no control group. Randomization is not sensible because blinding is infeasible, as there is no way to administer a sham dermaroller/dermastamp treatment. If such unblinded subjects were randomized to the no-treatment group, dropout rate and/or confounding use of other products may be very high, as people wouldn’t want to wait around and do nothing about their aging process. Though the presence of a nonrandom (voluntary) control group could be helpful (as photographic raters could still remain blinded to group allocation if shown a mix of treated and untreated subjects), our proposed methods of control are probably superior to this option. For all subjective rating, as we will see, raters will in effect be blinded by our novel design. It is elaborated further down. In short, voluntary raters are shown before and/or after photographs online but are blinded to chronology (or at times even to the fact that there is such a thing as chronology). They at times are also led to believe untreated controls are mixed within the photos, when in reality they are not.
Benefits of Enrolling
Subjects have the opportunity to help further scientific knowledge, and may receive cosmetic benefits from the treatments. If funds are available, trial subjects will receive free products for the 1-year duration of the trial. However, this is uncertain.
Inclusion and Exclusion Criteria
Proposed inclusion criteria are as follows:
- Must consider yourself as having some degree of hair loss that “involves the topmost part of the head”. (See Figure 1 of this study.4 “Top” is defined by drawing an invisible upward connecting line between ear canals.)
- Must be at least 30 years old
- Must have access to a desktop computer
- Must have a willingness to follow through with the combination treatments for 1 year
- If male, must be willing to have a standardized haircut at the start and end of the study
- Must be willing to correspond periodically with the investigators
- Must be willing to follow up in person at the start and end of the study
- Must not have done any hair loss treatments within the last 6 months.
- Other than the ones being tested, must not start any other products or treatments for improving or managing hair loss, hair appearance, or facial skin appearance during the study
- Must make a good faith effort to discontinue all current products or treatments for improving hair appearance or facial skin appearance during the study. However, bare minimum products or treatments which subjects are not comfortable stopping may be continued.
- Must not change hair color during the study.
- Must be willing to permit use of photographs for any research, scientific, and/or marketing purposes. (Permission to use photographs on any online medium hosted by Earth Med Research can be rescinded at any time upon request. Photos published in scientific journals cannot be rescinded.)
- Must not discuss the trial online until the results are published.
Proposed exclusion criteria are as follows:
- Having any diagnosed chronic condition affecting the skin of the face or scalp, except hair loss and/or acne.
- History of psoriasis or lichen planus
- Having hair loss that is considered temporary (i.e. telogen effluvium), due to trichotillomania (i.e. pulling your own hair out), or due to cancer treatments.
- Having facial hair such as beard or mustache
- Having any facial scarring, with the exception of acne scars
- Having previously received any facial toxins, fillers, or surgeries
Subjects will meet with investigators in person at the start and the end of the study. Before the first meeting, subjects will have had to inform investigators which baseline products they have decided to stop and which they have decided to continue during the trial. Investigators will correspond with subjects and advise them on further discontinuing anything remaining which may be unnecessary. After reaching an agreement, subjects will go through a product washout period of at least three days before meeting in person.
Men will also be required to receive a uniform haircut within a week of their initial intake. The entire head should be trimmed with one chosen attachment length, no longer than 1 inch. The length of the haircut must be reported.
Initial intake will describe the study, check the exclusion/inclusion criteria, provide instructions on treatment use and adverse event reporting, and collect basic data such as name, contact information, age, ethnicity, history of medical diagnoses (including if type of hair loss has been medically specified), and history of medications. Men will report the length of their haircut. Informed consent forms will be administered based on the WHO’s template.5
Likewise, before meeting in person again at the end of the study, subjects will stop most interventions and go through washout periods of 1 week for microneedling, and two days for the hair loss and facial serums. Men will again be required to have received a uniform haircut within the previous week, using the same attachment length reported at the start of the study.
The following data will be collected at the meetings at both the start and end of the study:
- Trichoscopic photos: There are cameras that allow hair on the scalp to be viewed at many-fold magnification. Such photos can then be analyzed by software to compute things like hair density (number of hairs on the scalp per square cm), hair thickness (diameter of individual hairs), and several other things. FotoFinder seems to be a leading supplier of such equipment.6 They also offer lab services.7 Trichoscopic photos taken with their system can be uploaded to their lab, and analyzed professionally for a fee, which eliminates the need to learn such analysis for oneself. Their systems seem expensive, and I am skeptical we will be able to justify the price of their products and services over other options. There is another company called TrikoScope.8 They charge $300 for a camera that attaches to a smart phone that can take such photos. And I gather their software services cost about $80 a month. And we would only likely need the software for a few months during the study period. Thus, TrikoScope likely represents the cheapest option for trichoscopic photography and analysis. I have not confirmed the capabilities of their software to calculate hair thickness (which is a primary desire), but it is probably capable. One downside is that their system does not seem to have been used in peer-reviewed research. But by the time we actually get to this study, that may change. Another option that has been used in peer-reviewed literature is Folliscope.4,9–11 And another is TrichoScan.12–15 Whatever system is used, determination of repeatable measurement sites on the head is needed. Here is an example of a study that defines a method of identification of sites to take photos on the back, sides, front, and top of the head.4 Trichoscopy systems may also differ in that some require hair to be clipped and dyed at a site before taking a photo, whereas some do not.
- Measurements of thumbnails: Digital calipers that can measure down to 0.01mm have been used to measure the thickness of fingernails.16 These cost roughly $150. Both thumbnails will be measured.
- Measurements of eyelashes: Length of eyelashes can likewise be measured using a digital caliper.17–20 The longest single eyelash on the upper eyelids of both the right and left eyes will be measured. However, before taking measurements, eyelashes will be straightened by using a heated eyelash curler, as demonstrated in this video.21
- Photos of hair loss: Hair loss will be photographed using standardized methods. Here is an example of a standardized setup that researchers have used.14 A rough guess is that such a setup might be constructed for perhaps $1,000. There now exist companies that specialize in providing equipment to take standardized photos for cosmetic research, as well as software to analyze such photos. Again, one such company is FotoFinder. Prices are not overtly available for FotoFinder products, but I surmise costs for needed equipment may be in excess of $10,000. I emailed the company and they did mention that renting equipment for research purposes may be an option. It will have to be considered, but I am skeptical that they will be able to compete on cost with other self-constructed methods. I am also skeptical of the value of their systems above a self-constructed system in general.
- Photos of eyelashes: The same photo equipment will be used to take standardized photos of upper eyelashes, with eyes closed, viewed from above. I am unable to find any studies that explicitly described the setup they used to take standardized eyelash photos. However, here is a study that used equipment from DermaLite to help with photographing eyelashes.22
- Photos of hair prominence: The same photo equipment will again be used to take standardized photos that will be used in “Hair Prominence Assessment” (HPA). These will be three photos taken on the left, right, and back sides of the head respectively. For men, the photography is straightforward. For women, hair will be parted horizontally on each side, and pinned up to make the roots visible. For women, photos will be close ups of just the first few inches of the most recent hair growth emerging from the scalp.
- Photos of face: The same photo equipment will again be used to take photos of the face. Here are examples of studies that describe their methods of taking standardized photographs of the face.23,24 FotoFinder products have also been used in facial photography.25 Though photos will be taken from several angles, we will likely only make use of one photo per subject of the whole frontal face at rest. Headbands will be used to keep the subjects’ hair pulled back. Women will wear their hair in a ponytail. Shirts will all be of uniform color.
Photos of the eyelashes must be taken before the eyelashes are straightened for measurement. The photos of hair prominence and of the face will be taken last in order to allow subjects’ hair and skin to acclimatize to the studio environment.25
At the final meeting, subjects will complete an exit survey with the following questions:
- How often did you use each of the 4 treatments respectively?
- What dermaroller/dermastamp needle length did you work up to (0.5mm, 0.75mm, 1mm, or 1.5mm)?
- Did you soak your dermaroller/dermastamp in rubbing alcohol after each use?
- Did you generally take the full recommended dose of the amino acids/MSM supplement? If not, how much did you generally take?
- Did you take the amino acids on an empty stomach?
- How many minutes would it take you to do each of the 4 individual treatments at a given instance?
- In your opinion is your hair loss in complete remission as of today (yes or no)?
- If not, are you continuing to see hair loss improvements in your opinion?
- Are you continuing to see skin improvements in your opinion?
- Would you recommend this combination treatment to others (scale of -3 to 3)?
- Do you plan to continue any of the treatments (yes, no, or unsure for each treatment)? If no, why?
- How often did you replace your dermaroller/dermastamp?
- Did you start any cosmetic product or treatment that we do not know about during the study? If so, when, what, and for how long?
- Did you stop any cosmetic product or treatment that you informed us about at the start of the study? If so, when and what, and for how long?
Real-World Blinded Rating System
By definition, people pursue cosmetic treatments because they wish to appear better in their own eyes, and in the eyes of the people they meet on a day-to-day basis. It is therefore strange that nonmedical persons are never used as raters in cosmetic trials. If for example either expert ratings or objective measurements of something do not highly correlate with real-world ratings, which method of measurement should be considered of low validity? The case can be made that the real-world ratings are valid by definition, even if correlation with objective measurements and both inter and intra-rater reliabilities are all low. Real-world rating potentially provides several advantages over other methods:
- Confidence of real-world value of a treatment
- Feasibility of a large number of raters being employed in a given study, rather than a small number of experts
- Less need for expensive objective measurement technologies and expertise in their use.
- Simpler scoring methods, which are more suitable to “real” people
- Involves and interests the public in the scientific process
None of this is to say however that it is not worth investigating whether or not any real-world rating system does have high inter and intra-rater reliability and high correlation with other methods of measurement.
We wish to explore these issues. Even in the event our interventions provide no benefit, our study will be the first attempt that we are aware of to explore such a real-world rating system. Voluntary raters will be enlisted through the internet. They will be required to have access to a desktop computer (and to do rating through a desktop computer) and be at least 30 years of age. They will provide their age, gender, and country, and must affirm that they have never read any online discussion about the design of this trial before. They must affirm their willingness to repeat the same ratings a second time, two weeks later. They will then be asked to rate sets of photographs viewed through an online system. Multiple sets of raters will be enlisted for different rating purposes:
Rating 1: Fifty raters will be shown both before and after photos of the hair loss of the 50 subjects. The order in which subjects are viewed will be randomized for each rater. Furthermore, the order in which the before and after photos are viewed will be randomized, and will not be labeled. Thus, raters will be blinded to the chronology of the photos. Further still, raters will be told (falsely) that “some” (i.e. an unspecified number) of the subjects they are evaluating represent untreated controls. They will be asked to score the relative difference in hair loss of the photos on the right side of the screen (called “B”) to the photos on the left side of the screen (called “A”) from -3 to +3. A score of -3 implies that B is much worse than A. A score of +3 implies that B is much better than A. A score of 0 implies that B and A are similar. Here are examples of studies that have used such a 7-point scale, albeit not in our blinded fashion.14,26,27
In reality, the claim that some of the subjects are untreated controls will be a lie, but will serve to create doubt in the minds of raters should any subtle systematic signs in the photographic environment contribute to unblinding of chronology, thus controlling any potential bias in favor of the treatment. Without such a measure, obvious improvements in hair loss could also potentially cause unblinding, leading to a risk of treatment efficacy being somewhat exaggerated by raters. However, since raters will be made to believe there are untreated controls among the photos, obvious differences between before and after photos may equally be suspected as being progression of hair loss, rather than reversal of hair loss. This is desirable. An ethics committee would almost certainly approve of this design, as no actual subjects are placed at risks they did not consent to.
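The presentation scheme for this first rating can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the function name `build_rating1_schedule`, the `rater_id` strings, and the seeding scheme are all hypothetical, and the real online system may work quite differently. The point is that the investigators keep a key recording which photo set was shown as A and which as B, while raters never see it.

```python
import random

def build_rating1_schedule(subject_ids, rater_id, seed=0):
    """Return one rater's viewing schedule (hypothetical helper).

    Each entry records which photo set is shown on the left (A) and which
    on the right (B). Only investigators ever see this key.
    """
    rng = random.Random(f"{seed}-{rater_id}")  # reproducible per rater
    order = list(subject_ids)
    rng.shuffle(order)  # subject order is randomized for each rater
    schedule = []
    for subject in order:
        before_on_left = rng.random() < 0.5  # coin flip for A/B placement
        schedule.append({
            "subject": subject,
            "A": "before" if before_on_left else "after",
            "B": "after" if before_on_left else "before",
        })
    return schedule

# Example: 50 subjects numbered 1 to 50, one hypothetical rater.
schedule = build_rating1_schedule(range(1, 51), rater_id="rater-07")
```

Seeding the generator from the rater identifier means the same schedule can be regenerated for the repeat rating two weeks later, which needs the exact same photos in the exact same order.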
Rating 2: Fifty different raters will be randomly shown only the unlabeled before OR after photos of hair loss of any given subject, but not both. Thus, a given rater would be expected to view roughly 25 before sets of photos and 25 after sets of photos. The order in which subjects are viewed is again randomized for each rater. Raters will be told (falsely) that all photos represent “before” photos taken at the start of a clinical trial. They will be provided with graphics of the Hamilton-Norwood scale (for men) and the Savin scale (for women). They will be asked to stage the hair loss of each subject, according to these scales. Each grade on the scales will be assigned a numerical value, as was done for the Hamilton-Norwood scale in this study.28 Subjects whose hair loss does not fit these classifications will be excluded from this rating.
In this context, the lie told to raters serves to completely neutralize all potential for unblinding bias, as raters are not aware that there exists anything to become unblinded to. Again, this should not pose an ethical problem.
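As a rough sketch of this single-timepoint presentation (which the later ratings reuse), each rater could be given a shuffled subject list with an independent coin flip deciding whether the before or the after set is shown. The function name and seeding scheme below are hypothetical, chosen only for illustration.

```python
import random

def build_single_timepoint_schedule(subject_ids, rater_id, seed=0):
    """Return [(subject, timepoint), ...] for one rater (hypothetical helper).

    The rater sees each subject exactly once, at either "before" or "after".
    """
    rng = random.Random(f"{seed}-{rater_id}")  # reproducible per rater
    order = list(subject_ids)
    rng.shuffle(order)  # subject order is randomized for each rater
    # Independent coin flip per subject: roughly half "before", half "after".
    return [(subject, rng.choice(["before", "after"])) for subject in order]

single_schedule = build_single_timepoint_schedule(range(1, 51), rater_id="rater-12")
```

With independent coin flips, the number of raters who see a given subject's before set will vary by chance. If exactly balanced counts are wanted, the assignment could instead be drawn from a shuffled list of 25 "before" and 25 "after" labels; that would be a design choice, not something this proposal specifies.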
Rating 3: Another fifty different raters will assess hair prominence. The method of photo presentation will be exactly the same as in Rating 2. The only difference is that instead of hair loss photos, raters are shown the photos of hair prominence. Raters will be asked to provide a Hair Prominence Assessment (HPA) score for each photo.
HPA is a scale I now define. To the best of my knowledge, there does not presently exist a validated assessment scale for overall hair quality/prominence. If anyone knows of one, please tell me! HPA is analogous to the Global Eyelash Assessment (GEA) scale. GEA is a validated scale of overall eyelash prominence based on length, fullness, and darkness.29–32 In GEA, raters are asked to assign a score of 1 (minimal) to 4 (very marked) with the help of a photonumeric guide. With HPA, raters will likewise be asked to score overall hair prominence from 1 (minimal) to 4 (very marked), based on thickness, gloss, and darkness.
The reason for women having only their roots photographed for this measure is that only the recent hair growth which has occurred during the 1 year of the trial will be able to reflect any enhanced quality resulting from the interventions. Men on the other hand will have seen complete turnover of their hair. The reason the sides of the head are the chosen locations for these photographs is that they will be the least confounded by either the progression or reversal of hair thinning or hair recession that occurs between the start and end of the trial.
We will also seek to first create a photonumeric guide for the HPA. Though if difficulty in obtaining adequate subjects occurs, we may abandon this, and implement HPA without the help of a photonumeric guide.
Rating 4: Another fifty different raters will assess global skin appearance. The method of photo presentation will be exactly the same as in Rating 2. The only difference is that instead of hair loss photos, raters are shown one single photo of the whole frontal face at rest of any given subject. Raters will be asked to guess the age of the person in the photo.
When expert raters are used, it has been shown that estimated age based on such facial photos is highly correlated with actual age, and that inter and intra-rater reliability is high.33 Thus, it is likely this validated measure of facial aging will perform similarly among real-world raters. The simplicity of this measure makes it suitable for real-world raters who may lack the ability to analyze the face in finer detail.
Rating 5: Another fifty different raters will assess eyelash prominence. The method of photo presentation will be exactly the same as in Rating 2. The only difference is that instead of hair loss photos, raters are shown one single photo of the upper eyelashes of any given subject. Raters will be asked to provide a Global Eyelash Assessment (GEA) score (as discussed under Rating 3) for each photo.
All 50 subjects will be given their own complete digital set of before and after photos in an unblinded fashion to assist in their self-rating. The standard 7-point scale of -3 to +3 will be used here to assess hair loss. This scale has been used for self-assessment in the literature.34 The HPA scale will be used to assess hair prominence, and subjects will be asked to assign themselves both a before and an after score, with the added liberty of not being limited to whole numbers on just the “after” score. To utilize this option, subjects will have to pass over the visible options of 1, 2, 3, and 4, explicitly click the “Other” button, and manually type in any decimal number between 1 and 4. Estimated age will be used to assess global skin appearance. Subjects will be asked to input their opinions of how old their face looked or looks both before and after. The before score will be constrained to whole numbers, and the after score can be any decimal number. The GEA scale will be used to assess before and after eyelash prominence, according to the same rules described for self-assessment with the HPA scale.
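The input rule for self-rating on the HPA and GEA scales (whole numbers for "before", any decimal for "after") could be enforced by a small check like the one below. This is a hedged sketch only: `validate_scale_score` is a hypothetical name, and the bounds are left as parameters so the same idea could be adapted to estimated age.

```python
def validate_scale_score(raw, timepoint, low=1, high=4):
    """Validate one self-rated score typed into the form (hypothetical helper).

    "before" scores must be whole numbers; "after" scores may be decimals.
    """
    value = float(raw)  # raises ValueError for non-numeric input
    if not low <= value <= high:
        raise ValueError(f"score must be between {low} and {high}")
    if timepoint == "before" and not value.is_integer():
        raise ValueError("the 'before' score must be a whole number")
    return value
```

For example, `validate_scale_score("2.5", "after")` is accepted, while the same input for "before" is rejected.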
Repeated Rating Cycles
In order to assess intra-rater reliability, Ratings 1 through 5 will all be repeated a second time. Two weeks after each respective initial rating, the same raters (or as many of them as will follow up) will be asked to rate the exact same photos in the exact same order. One to 4 weeks has been suggested to be sufficient to reduce memory effects.33
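One simple way to quantify how well a rater agrees with themselves across the two cycles is sketched below. The proposal does not name a statistic, so the use of the Pearson correlation here is purely an assumption for illustration; an intraclass correlation coefficient or a weighted kappa are common alternatives for rating scales. The function names and the dictionary layout are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)

def intra_rater_agreement(cycle1, cycle2):
    """cycle1 and cycle2 map photo id -> score for ONE rater (assumed layout)."""
    shared = sorted(set(cycle1) & set(cycle2))  # photos rated in both cycles
    return pearson([cycle1[p] for p in shared], [cycle2[p] for p in shared])
```

Restricting the comparison to photos rated in both cycles accommodates raters who only partly complete the follow-up.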
Thus, in total, 250 unique online raters will be required to complete all of the above ratings, in addition to the 50 study subjects. The 250 online raters will be used in two rating cycles.
All analyses will be performed for men and women separately, and for both genders combined. All analyses described are done on data collected from the first rating cycles only, and not the second. Data from the second rating cycles are used only in calculating intra-rater reliabilities:
- Improvement from Rating 1: Relative improvement in hair loss can be calculated from Rating 1. If for example photos B from a given subject, for a given rater, are from “before”, yet were given a score of -3 relative to photos A for that subject, then that -3 would be transformed into a true improvement of +3 associated with the treatment. Average relative improvement for each subject can thus be calculated. Then (of such averages) the average, standard deviation, and median in the overall population can be calculated.
- Inter-rater reliability of Rating 1: Inter-rater reliability for the above improvements in hair loss derived from Rating 1 will be calculated.
- Test of bias in Rating 1: Testing for a significant difference in average rated improvement when photos are ordered one way rather than the other (i.e. before then after, versus after then before) may indicate or rule out bias caused by ordering.
- Improvement from Rating 2: Relative improvement in hair loss can be calculated from Rating 2 as well. For a given subject, average rated grade of hair loss for both before and after can be calculated. The difference in these averages can then be calculated for each subject. Then (of such differences) the average, standard deviation, and median in the overall population can be calculated. A test for significant improvement will also be done.
- Inter-rater reliability of Rating 2: Inter-rater reliabilities for both “before” and “after” staging of hair loss will be calculated separately. Here is an example of a study that calculated inter and intra-rater reliabilities for the Hamilton-Norwood scale.28
- Correlation of improvements from Rating 1 and Rating 2: Correlation between the above two different improvement measures in hair loss (derived from Rating 1 and Rating 2) will be calculated.
- Improvement from Rating 3: Relative improvement in hair prominence will be calculated from Rating 3, just as was done for relative improvement in hair loss from Rating 2. A test for significant improvement will also be done.
- Inter-rater reliabilities of Rating 3: Inter-rater reliabilities for both “before” and “after” scoring of hair prominence will be calculated separately.
- Improvement from Rating 4: Relative improvement in facial age can be calculated from Rating 4. For each subject, average rated age for both before and after can be calculated. The difference in these averages can then be calculated. These differences will then each be adjusted by 1 year to account for the fact that subjects have naturally aged 1 year during the course of the study. Then (of such adjusted differences) the average, standard deviation, and median in the overall population can be calculated. This would yield an estimate of how many years younger subjects look as a result of treatment. A test for significant improvement will also be done.
- Inter-rater reliabilities of Rating 4: Inter-rater reliabilities for both “before” and “after” estimated ages from Rating 4 will be calculated separately.
- Correlation of estimated age and actual age: Correlation between actual subject “before” age and estimated “before” age will be calculated.
- Recomputation of the three preceding analyses in a subgroup: Rzany et al. validated estimated age as an assessment scale in a population aged 25 to 66 with Fitzpatrick skin types I through IV.33 We should repeat the three preceding analyses (improvement from Rating 4, inter-rater reliabilities of Rating 4, and correlation of estimated age and actual age) in just this subgroup to see if there is any serious difference.
- Improvements in hair density and hair thickness: Hair density and hair thickness both before and after will be calculated by software analysis, according to the system of trichoscopy we choose. Before and after differences for each subject will be calculated. Then (of such differences) the average, standard deviation, and median in the overall population can be calculated. Tests for significant improvements will also be done. The sites on the head used to calculate hair thickness will be those from only the back and the sides. The sites on the head used to calculate hair density will be those from only the top.
- Correlations of improvement in hair density and improvements from both Rating 1 and Rating 2: Correlations between improvement in hair density and the improvements in hair loss derived from both Rating 1 and Rating 2 will be calculated. Among subjects that reverse their hair loss, it is possible that this would mostly not involve the sides, back, or front of the head. Thus, it makes sense that we would have calculated hair density based only on sites from the top of the head. Note that our inclusion criteria specify that subjects must have hair loss that “involves the topmost part of the head”.
- Correlation of improvement in hair thickness and improvement from Rating 3: Correlation between improvement in hair thickness and the improvement in hair prominence derived from Rating 3 will be calculated. As Rating 3 is based on photos of the back and sides of the head, it makes sense that we would have calculated hair thickness based only on sites from the back and sides of the head.
- Improvement in eyelash length: The difference in average before and average after eyelash length will be calculated for each subject. Then (of such differences) the average, standard deviation, and median in the overall population can be calculated. A test for significant improvement will also be done.
- Improvement from Rating 5: Relative improvement in eyelash prominence will be calculated from Rating 5, just as was done for relative improvement in hair loss from Rating 2. A test for significant improvement will also be done.
- Inter-rater reliabilities of Rating 5: Inter-rater reliabilities for both “before” and “after” scoring of eyelash prominence will be calculated separately.
- Correlations of improvement in eyelash prominence and improvements in both hair thickness and eyelash length: The correlation between improvement in eyelash prominence and improvement in hair thickness will be calculated, as will the correlation between improvement in eyelash prominence and improvement in eyelash length.
- Improvement in thumbnail thickness: The difference in average before and average after thumbnail thickness will be calculated for each subject. Then (of such differences) the average, standard deviation, and median in the overall population can be calculated. A test for significant improvement will also be done.
- Self-rated improvements: Self-rated improvements in hair loss, hair prominence, facial age, and eyelash prominence can be calculated from Rating 6, just as was done for real-world raters. Significance tests will also be done (when applicable).
- Correlations of self-rated improvements and real-world rated improvements: Correlations between these self-rated improvements in hair loss, hair prominence, facial age, and eyelash prominence (derived from Rating 6) and the same real-world rated improvements in hair loss, hair prominence, facial age, and eyelash prominence (derived from Ratings 1, 3, 4, and 5 respectively) will be calculated.
- Test of significant difference in hair loss improvements between most aggressive and least aggressive microneedling usage: A test of significant difference in hair loss improvements (based on Rating 1) between those with the lowest microneedling frequency of use (say once per week or less) and the highest frequency of use (say once every 2 days or more) will be performed. A second test will compare those with the smallest ultimate needle size (0.5 mm or less) against those with the largest ultimate needle size (say 1 mm or more for men and 1.5 mm or more for women). As these may in part be redundant tests, the correlation between frequency of microneedling and needle size will also be calculated to assess this.
- Intra-rater reliabilities for Ratings 1 through 5: Based on all raters that successfully follow-up and complete the second rating cycle, intra-rater reliabilities for Ratings 1 through 5 will be calculated.
- Exit survey statistics: Summary statistics of relevant questions on the exit survey will be calculated.
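As a rough illustration of the Rating 4 improvement analysis in the list above, the following Python sketch computes each subject's adjusted improvement and then the average, standard deviation, and median across subjects. The subject IDs and ratings are made up, the sign convention (positive means rated younger than expected) is an assumption, and the one-sample t statistic is only one possible choice of significance test, since the proposal does not name a specific one.

```python
from math import sqrt
from statistics import mean, median, stdev

YEARS_ELAPSED = 1.0  # subjects age 1 year over the course of the study


def adjusted_improvement(before, after, years_elapsed=YEARS_ELAPSED):
    """Years younger a subject looks after treatment, net of natural aging.

    Assumed sign convention: positive = rated younger than expected.
    """
    return (mean(before) + years_elapsed) - mean(after)


# Hypothetical rater estimates of age: subject -> (before ratings, after ratings)
ratings = {
    "S01": ([52, 54, 50], [50, 51, 49]),
    "S02": ([40, 42, 41], [41, 42, 40]),
    "S03": ([60, 63, 60], [63, 62, 64]),
    "S04": ([35, 37, 36], [34, 35, 33]),
    "S05": ([47, 49, 48], [46, 47, 45]),
}

improvements = [adjusted_improvement(b, a) for b, a in ratings.values()]

avg = mean(improvements)
sd = stdev(improvements)  # sample standard deviation
med = median(improvements)

# One-sample t statistic against "no improvement" (assumed test choice).
# A p-value would come from a t distribution with n - 1 degrees of freedom.
t_stat = avg / (sd / sqrt(len(improvements)))

print(f"mean={avg:.2f} sd={sd:.2f} median={med:.2f} t={t_stat:.2f}")
```

The same pattern (per-subject difference, then summary statistics over the differences) would apply to the hair density, hair thickness, eyelash length, and thumbnail thickness analyses, minus the 1-year adjustment.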
Adverse Events Reporting and Management
Subjects will have investigator contact information to report unsolicited adverse events at any time. Adverse event checklists will also be administered once a month. For each adverse event, subjects will be asked on how many days in the past month they experienced it. Options will be “never”, “7 or fewer days”, “14 or fewer days”, or “more than 14 days”. They will rate each reported adverse event with a severity of mild, moderate, or severe. They will be asked to clarify if the adverse event pertains to the scalp, the face, some other specific part, or just the body in general. They will also be asked, “Are the adverse events you experienced acceptable to you overall?” (yes, no, or unsure).
Example adverse events for dermarolling are listed here.35 Some select adverse events for minoxidil are likely applicable to other hair growth serums. Example adverse events for minoxidil are listed here.36 Some select adverse events for retinoids are likely applicable to other facial serums. Example adverse events for retinoids are listed here.37 Some select adverse events pertinent to skin-related cosmetic procedures may also be applicable to a facial serum trial. Examples of such adverse events are listed here.38
Regarding the oral amino acid/MSM supplement, all available evidence suggests MSM is virtually nontoxic.39 One trial of 6 grams per day yielded no reported adverse events after 26 weeks of use.40 This is higher than our proposed 3 grams per day. Our proposed daily dose of 20 amino acids, in sensible ratios totaling 3 grams, is physiologic and not expected to produce adverse events. Hence, in addition to skin and hair-related adverse events on our checklist, a handful of generic adverse events will be included, for which there is no particular evidence (or only anecdotal evidence) of an association with the oral supplement. Amino acids can also be expected to increase muscle mass, which can be considered a beneficial side effect.
Thus, from the above, and with the addition of some potential adverse events from our own experience or concern, an overall checklist is derived as follows:
Increased hair fall, rash, itchiness, soreness, skin infection, visible blood, scarring, discoloration, flakiness, unpleasant smell, dry skin, acne, dry lips, sunburn, peeling, burning, stinging, swelling, cold sores, blisters, stinging from getting product in eyes, local allergic reaction, cold or flu-like symptoms, nausea, abdominal discomfort, bloating, diarrhea, headache, fatigue, dizziness, weakness, cognitive difficulties, changes in mood, irritability.
The above combined checklist is what will be administered. Subjects will be asked to provide any elaboration desired, as well as if they associate adverse events with treatment (i.e. definitely, probably, unsure, or no). Subjects will also be asked questions 1 through 4 from the previously described exit survey, aimed at describing which treatments were in use during the last month, and to what extent. They will also be asked to describe any adverse events not listed. Investigators will ask for further clarification as needed. The principal investigator will review all monthly adverse event reporting. Adverse events deemed serious will be reported to the ethics committee. In-person follow-up, reduction, breaks, or discontinuation of select treatment(s), or complete discontinuation of all treatments will be recommended to the subjects as appropriate.
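One possible way to record a single line of the monthly checklist is sketched below in Python. The class name `AdverseEventEntry`, its field names, and the validation approach are hypothetical; only the answer options are taken from the description above. Whether an event counts as serious is left to the principal investigator and is deliberately not computed here.

```python
from dataclasses import dataclass

# Answer options as described in the proposal
FREQUENCY = ("never", "7 or fewer days", "14 or fewer days", "more than 14 days")
SEVERITY = ("mild", "moderate", "severe")
LOCATION = ("scalp", "face", "other specific part", "body in general")
ATTRIBUTION = ("definitely", "probably", "unsure", "no")


@dataclass(frozen=True)
class AdverseEventEntry:
    """One reported adverse event from a monthly checklist (hypothetical schema)."""

    event: str        # e.g. "itchiness", or a free-text unlisted event
    frequency: str    # one of FREQUENCY
    severity: str     # one of SEVERITY
    location: str     # one of LOCATION
    attribution: str  # does the subject associate it with treatment?
    notes: str = ""   # optional elaboration from the subject

    def __post_init__(self):
        # Reject answers that are not among the allowed options.
        checks = (
            ("frequency", FREQUENCY),
            ("severity", SEVERITY),
            ("location", LOCATION),
            ("attribution", ATTRIBUTION),
        )
        for field_name, allowed in checks:
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{field_name}={value!r} not in {allowed}")


# Hypothetical example entry
entry = AdverseEventEntry(
    event="itchiness",
    frequency="7 or fewer days",
    severity="mild",
    location="scalp",
    attribution="probably",
)
print(entry)
```

Validating answers at entry time keeps the monthly reports uniform, which should make the principal investigator's review easier.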
Equipment: We will need a camera, dermatoscope, software to analyze the dermatoscopic photos, stereotactic equipment for repeatable positioning of the camera and of the subjects’ heads, diffuse lighting, and digital calipers. On the low end this might all cost $1500. On the high end it might cost over $10,000.
Incentives: If funds are available, free products will be provided to subjects for the duration of the trial. Retail cost of all the products for 1 year may be around $350 per person. At-cost price may be around $150. For 50 subjects, that would come out to about $7,500. In any case, subjects will pay for their own haircuts and also for their own rubbing alcohol to sterilize their dermarollers/dermastamps with.
Publication fees: An increasing number of reputable journals are offering authors a choice between subscription-based access and open access. We will opt for open access, in which case the authors are expected to cover the fees of publication. This can range from $1000 to $5000.
Ethical review fees: Ethical review fees may cost around $3,000.
Labor: I won’t give a figure for labor, as it is the hardest to estimate. It depends a lot on how much the needed people feel inclined to volunteer their time. There’s the principal investigator, a programmer to create the online rating system, a statistician to analyze the data, and a photographer to set up a studio, to take all the photos, and to create a photonumeric guide for the HPA scale. I will be setting up and coordinating the study, writing and publishing, etc.
Totally ignoring labor, total ballpark cost could be anywhere from $7,500 to $25,000.
This proposed trial offers a novel method of evaluating cosmetic benefit while employing a number of creative controls. This would be of value to present no matter how effective or ineffective the proposed interventions are. The interventions themselves all already have some level of evidence and will likely provide benefit. This trial would further introduce a new and potentially useful assessment scale for hair prominence.
Our novel methods of control built into the real-world rating system eliminate the need for a control group entirely. This not only has the advantage of requiring fewer trial participants, but of reduced confounding as well. It may not even be feasible to maintain blinding in an RCT that involves microneedling, as there is perhaps no way to administer a sham treatment. This could lead to high dropout rates and increased confounding due to concurrent product use within control groups. Our design avoids this. Furthermore, a placebo facial serum is itself liable to at least be moisturizing, thus leading to underestimation of the benefit of the intervention facial serum, defined as being relative to no treatment.
A washout period of 2 days for the facial serum at the end of the trial seems ideal. If the washout period is shorter, rater unblinding may be increased due to overt transient gloss on the face from the presence of the product itself. If the washout period is longer, then improvements in skin appearance would only reflect the permanent improvements to the skin itself due to protein incorporation, etc. The transient improvements from the presence of the product itself may have washed out by then, which would be undesirable.
This trial would account for a number of considerations that are often overlooked such as including washout periods at the start and end of the trial, documentation of concurrent product use, allowing time for acclimatization to a photographic studio environment, standardizing the length of men’s hair over multiple photo shoots, and ensuring blinding of raters. Given the quality of cosmetic clinical trials as a whole, in my opinion this proposed design is of high quality.
There are expensive meters to directly measure hair gloss. But gloss is only one part of overall hair prominence. It isn’t worth the cost in my opinion, as there is no guarantee it would correlate well with real-world hair prominence assessment (HPA) scores. In that event, I again opine that the real-world scores would be more believable. To date, I am not aware of any validated assessment scale for overall beauty or prominence of hair. We will, however, be attempting to correlate HPA with hair thickness, as that capability will incidentally come along with whatever dermatoscopy system we choose, and would not represent an additional investment.
There are also meters for skin elasticity. Again, I don’t think it is worth the cost. Estimated age of face is already a validated measure when done with expert raters, is cheap to do, and is of more intuitive value to consumers. It is likely we will be able to validate this measure with real-world raters as well.
Another thing to consider is that it is always an option to greatly simplify this proposed study and ditch complicated rating systems and expensive equipment. Some quick and dirty before and after hair loss photos could be taken, subjects could do a lot of self-rating, and nails and eyelashes could be measured with a cheap caliper. This would still achieve the ends of giving our group practice in seeing a clinical trial through from start to finish, and of helping to raise funds through promotion of our products. The thing about treatments that really work is that their effects can be obvious even in weak study designs. For example, someone had obvious hair loss, and now they don’t. My personal attitude to research has always been “if it doesn’t work so well that its effects are obvious even in a weak study design, I probably don’t care anyway”. So “cheaping out” is not outside the realm of options.
Self-rating is especially important to include with regard to HPA, since women will be able to do side-by-side comparison of their most recent hair growth contrasted against their older hair growth that occurred before the trial.
Hair loss improvements may be slightly underestimated if subjects with low grades of hair loss too easily achieve remission, as their contributions are thereby capped. We will comment on how many subjects this may have occurred with.
It is crucial women use dermastamping over dermarolling. If dermarolling is used, very careful technique is required to prevent hair breakage. Since dermarolling is used most in areas of thinning hair, causing hair breakage there could lead to the false appearance of worsening hair loss in photographs.
It is possible that hair prominence and/or hair thickness improvements (particularly in eyebrows and eyelashes) that may occur from the oral supplement may bias the age ratings of the face. This should not necessarily be considered an issue, as what matters is that people look better in the eyes of their peers, regardless of whether or not the entirety of the benefit is associated with skin. It is also possible that hair prominence and/or hair thickness improvements may cause reduced appearance of hair loss, independent of any increase in hair density.
Some may criticize this study for having too many outcomes, leading to possible data dredging. This possibility will of course be appropriately discussed in any publication. Furthermore, several of the outcomes are for the sake of correlating with other related measures, which in essence is the opposite of data dredging, and rather helps to detect it.
I’ve thought that perhaps a version of the online rating platform could be permanently deployed for marketing purposes after completion of the trial. Consumers could be asked to pick the more attractive of two photos (one before and one after) for a list of subjects. At the end of the rating, they can be informed what percentage of the time they happened to pick the “after” photos, as well as the probability that their result occurred by random chance. Scoring a high percentage would imply to specific consumers that the products really work.
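The feedback shown to a consumer at the end of such a rating could be computed along these lines. This is a minimal Python sketch, assuming each pick is an independent coin flip under the "no real difference" hypothesis and using a one-sided binomial tail probability. The function name `chance_probability` and the example counts are hypothetical.

```python
from math import comb


def chance_probability(n_pairs, n_after_picked):
    """P(picking 'after' at least n_after_picked times out of n_pairs by pure guessing)."""
    if not 0 <= n_after_picked <= n_pairs:
        raise ValueError("n_after_picked must be between 0 and n_pairs")
    # Count the equally likely pick sequences with at least this many 'after' picks.
    favorable = sum(comb(n_pairs, k) for k in range(n_after_picked, n_pairs + 1))
    return favorable / 2 ** n_pairs


# Hypothetical consumer: 20 photo pairs rated, "after" picked 15 times
n_pairs, n_after = 20, 15
percent_after = 100 * n_after / n_pairs
p_chance = chance_probability(n_pairs, n_after)

print(f"You picked the 'after' photo {percent_after:.0f}% of the time.")
print(f"Probability of doing at least this well by guessing: {p_chance:.3f}")
```

With these example numbers, the consumer would be told they picked the "after" photo 75% of the time, with a probability of about 0.02 of doing at least that well by chance.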
Unobvious Incentives for Collaboration
As this is likely the only cosmetic trial we will ever do, a photographer who works with us could be enticed by being allowed to keep the purchased equipment for him/herself. Cosmetic and clinical trial photography could be added to his/her marketable skill set and services.
An incentive for a programmer to work with us is that our proposed online rating platform could be a marketable product that other researchers would want to use.
An HPA photonumeric guide might also be a product other researchers would want to use.
This completes our proposal. This is all very preliminary, and constructive criticism and/or suggestions of potential collaboration are most welcome.