A few weeks back I decided to add this blog to my homeserver setup, in order to have some kind of “brain dump” for topics that I think about. More on that, and on why I decided to set up this blog, in a later post.
One of the things I wanted to add to the website was a way to get some insight into its traffic and overall usage. So I decided to include some kind of site analytics and picked Google Analytics, since it seemed to be the easiest to integrate. Even before Analytics was configured, I had the data protection statement ready for it, just in case and so that I would not forget it when finally integrating.
Enter the Austrian data protection agency
Just as I planned to read up on how to integrate Analytics, heise.de published an article stating that the solution does not comply with the European General Data Protection Regulation (GDPR).
Okay, so I had to find a new solution. I dug into the depths of the internet for something that fit my needs and found a nicely curated list of analytics solutions on GitHub called awesome-analytics.
Having learned from the last few minutes, my main goals for the homeserver setup became to self-host every system and to use as many open source solutions as possible. That meant a bunch of the tools on the list failed at least one of those two requirements. Since reading the heise article, privacy had also become a concern in my decision, so I wandered to the section called “Privacy focused analytics” and had a look at a few of the tools listed there.
Since I am lazy by nature, I took the solution that looked the nicest and could run inside a container, preferably on Kubernetes: Plausible Analytics.
Installing Plausible and finding the first strange behaviors
Since the official documentation only provides an installation guide for docker-swarm, I checked whether someone had already set up a Kubernetes installation for Plausible.
What I found was a GitHub repository with some manifest files that looked promising. It was simple enough to have the server running in a few steps and did not require too much tweaking of configuration files. In fact, only three files had to be altered.
I started to configure all the necessary passwords, but did not bother to define the initial mail-server setup, since I thought I would not need it to get a first impression of the tool. In the end my secret.yaml looked something like this:
apiVersion: v1
kind: Secret
metadata:
  name: plausible
type: Opaque
stringData:
  SECRET_KEY_BASE: my-secret-password # a randomly generated string. the longer the better
  ADMIN_USER_NAME: my-admin-username # your login user name
  ADMIN_USER_EMAIL: mail@mail.mail # your login email
  ADMIN_USER_PWD: some-admin-password # 100% up to you
  # see postgres.yaml for setting the database password
  DATABASE_URL: postgres://postgres:password@postgres-service:5432/postgres
  CLICKHOUSE_DATABASE_URL: http://clickhouse-service:8123/plausible
  BASE_URL: https://analytics.my-devbox.de # same address you used in ingress.yaml
  MAILER_EMAIL: <todo:replaceme> # email@example.com
  SMTP_HOST_ADDR: <todo:replaceme> # smtp.example.com
  SMTP_HOST_PORT: "465"
  SMTP_HOST_SSL_ENABLED: "true"
  SMTP_USER_NAME: <todo:replaceme> # email@example.com
  SMTP_USER_PWD: <todo:replaceme>
  DISABLE_REGISTRATION: "true"
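The comment on SECRET_KEY_BASE asks for a long, randomly generated string. One way to produce such a value is sketched below with Python's standard library. The function name and the choice of 64 random bytes are my own assumptions for this example, not a requirement taken from the Plausible documentation.

```python
import secrets


def generate_secret_key_base(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into secret.yaml instead of a hand-typed password.
    print(generate_secret_key_base())
```

The printed value replaces the placeholder in the manifest. It should never be committed to a public repository.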
Everything worked fine and I soon had all my pods up and running:
$ kubectl get po
NAME                         READY   STATUS    RESTARTS
clickhouse-statefulset-0     1/1     Running   0
plausible-5f5ffd7497-nfl66   1/1     Running   0
postgres-statefulset-0       1/1     Running   0
I logged into the server and tried to define the website to be analyzed. When I finished, I got an error 500 message (I would see this a few times in the course of this endeavour).
Having a look into the logs, it became clear that the missing mail-server setup was to blame. This wasn’t much of a problem until I wanted to change my user password. By design, password changes are only possible via a password reset by mail. So I had to define the mail setup earlier than planned, no problem. Trying to get a reset mail, I tested with the user defined in my manifests, which resulted in the same error message as before, indicating an error in my SMTP setup. So far, so good, nothing out of the ordinary.
At some point, just because it was easier to type, I tried to reset the password of test@test.de and got a success message instead.
My first thought was “finally, all done”, so I tried to reset my real user password and was greeted with the error 500 message again.
Wait a moment! So password resets for addresses in the database behave differently from ones for addresses that are not?! Shouldn’t this be a security concern?
The success message for my wrong mail clearly stated “[…] if it exists in our database.”, meaning that the message should show up in both cases. Even though the main issue was with my SMTP setup, I knew I was onto something. For the next few minutes I tried to fix the configuration to have a running and well-configured system before starting a deeper analysis. Soon enough I had a mail in my inbox and was ready to do some pen testing.
I remembered that I had seen something about timing attacks on YouTube back in the summer of last year. Timing attacks are basically an attack vector where the attacker gains insight into a target’s data by measuring and comparing response times when accessing publicly available resources (e.g. web services).
The results so far pointed towards the server sending mails synchronously, waiting for the mail server to respond before returning an answer to the user.
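To make the idea concrete, here is a minimal Python sketch of such a timing side channel. It does not talk to Plausible at all: `vulnerable_reset`, `send_mail_blocking` and the address set are hypothetical stand-ins, and the 50ms sleep merely simulates a slow SMTP round trip.

```python
import time

# Hypothetical stand-in for a user table; not real data.
KNOWN_EMAILS = {"alice@example.com"}


def send_mail_blocking(address: str, delay_s: float = 0.05) -> None:
    """Simulate a slow, synchronous SMTP round trip."""
    time.sleep(delay_s)


def vulnerable_reset(email: str) -> str:
    """Answer identically for every address, but block on mail only for known ones."""
    if email in KNOWN_EMAILS:
        send_mail_blocking(email)
    return "We sent a reset link, if the address exists in our database."


def timed_ms(func, *args) -> float:
    """Measure a single call in milliseconds with a monotonic clock."""
    start = time.perf_counter()
    func(*args)
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    print("unknown:", round(timed_ms(vulnerable_reset, "nobody@example.com"), 1), "ms")
    print("known:  ", round(timed_ms(vulnerable_reset, "alice@example.com"), 1), "ms")
```

Both calls return the same text, yet the call for the known address takes noticeably longer. That gap is all an observer needs.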
My first mode of attack was to get response times on the password-reset page “by hand”, or rather with the developer tools inside my browser. This was enough to get a first feeling for how bad the situation (at least in my setup) really was.
For an invalid mail I got a response time of around 80ms, for the valid one I received a response after around 1200ms - a significant difference.
But one data point isn’t enough for a reliable answer (the difference could have been due to network issues or some other external factors).
Automating my pen test to get more information
Measuring these times in my browser wasn’t an option in the long run - I wanted to automate it. I started up Postman and got to work.
The necessary endpoint was easy to identify thanks to my earlier experiments. What I hadn’t considered was the CSRF token in the request, which I had to read out first. Some scripting later I had everything set up and ready to run.
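The Postman scripts themselves are not reproduced here. As a rough illustration of the token step, the Python sketch below pulls a hidden form field out of an HTML page using only the standard library. The field name `_csrf_token` and the sample form are assumptions made for this example, so check the actual page source before relying on them.

```python
from html.parser import HTMLParser
from typing import Optional


class CsrfTokenParser(HTMLParser):
    """Remember the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name: str = "_csrf_token") -> None:
        super().__init__()
        self.field_name = field_name
        self.token: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.field_name:
            self.token = attributes.get("value")


def extract_csrf_token(html: str, field_name: str = "_csrf_token") -> Optional[str]:
    """Return the token value, or None if the page has no such field."""
    parser = CsrfTokenParser(field_name)
    parser.feed(html)
    return parser.token


# Hypothetical form markup, only here to exercise the parser.
SAMPLE_FORM = (
    '<form method="post">'
    '<input type="hidden" name="_csrf_token" value="abc123">'
    '<input type="email" name="email">'
    "</form>"
)

if __name__ == "__main__":
    print(extract_csrf_token(SAMPLE_FORM))
```

The extracted value then has to be sent back with the reset request, together with the session cookie the form page was served with.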
The server only allowed for five consecutive reset attempts, but this was enough for a proof of concept and to show the underlying security issue.
For invalid mail addresses I got an average roundtrip time of 73ms, valid addresses had a roundtrip of 1793ms on average (both with a data set of five - not much, but more than what I had before). Also, not a single attempt with a valid address came even close to the response times for invalid addresses.
At that point I became slightly nervous.
Did I really just find an exploitable issue?
Wouldn't it be possible to check for mail addresses in the server's database just by running a few password-reset attempts per address?
You would need a sufficiently big dictionary of mail addresses to start with, but there are leaks out there for this. And people tend to reuse their addresses. Sure, the issue was no zero-day, but it was still concerning enough for me to ask myself: what do you do now?
Checking the source code
My next step was to check the source code of Plausible. Good thing I opted for open source solutions: that way I can at least confirm my findings in the code - if I am able to read it. I found the culprit in a method called password_reset_request(). The developers were kind enough to name their classes and methods in a clean and meaningful style. That way I was able to read and understand the code, even though I hadn’t heard of the language they used.
def password_reset_request(conn, %{"email" => email} = params) do
  if PlausibleWeb.Captcha.verify(params["h-captcha-response"]) do
    user = Repo.get_by(Plausible.Auth.User, email: email)
    if user do
      token = Auth.Token.sign_password_reset(email)
      url = PlausibleWeb.Endpoint.url() <> "/password/reset?token=#{token}"
      Logger.debug("PASSWORD RESET LINK: " <> url)
      email_template = PlausibleWeb.Email.password_reset_email(email, url)
      Plausible.Mailer.deliver_now!(email_template)
      render(conn, "password_reset_request_success.html",
        email: email,
        layout: {PlausibleWeb.LayoutView, "focus.html"}
      )
    else
      render(conn, "password_reset_request_success.html",
        email: email,
        layout: {PlausibleWeb.LayoutView, "focus.html"}
      )
    end
  else
    render(conn, "password_reset_request_form.html",
      error: "Please complete the captcha to reset your password",
      layout: {PlausibleWeb.LayoutView, "focus.html"}
    )
  end
end
So I was right: if the user was found, the code really did send out a mail, wait for the response from the SMTP server and only then render the success message.
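One way to avoid this kind of leak is to take the slow mail delivery out of the request path. The following Python sketch is neither Plausible's Elixir code nor necessarily the way the maintainers addressed the problem. It simply hands the mail to a background worker through a queue, so the handler returns after roughly the same time for every address. All names in it are hypothetical.

```python
import queue
import threading
import time

KNOWN_EMAILS = {"alice@example.com"}  # hypothetical user table
mail_queue = queue.Queue()
delivered = []  # addresses the background worker has "sent" a mail to


def send_mail_blocking(address: str) -> None:
    """Simulate a slow SMTP round trip."""
    time.sleep(0.05)
    delivered.append(address)


def mail_worker() -> None:
    """Deliver queued mails in the background, outside of any request."""
    while True:
        address = mail_queue.get()
        try:
            send_mail_blocking(address)
        finally:
            mail_queue.task_done()


# Daemon thread: it must not keep the process alive on shutdown.
threading.Thread(target=mail_worker, daemon=True).start()


def reset_request(email: str) -> str:
    """Queue the mail instead of sending it, then answer identically for every address."""
    if email in KNOWN_EMAILS:
        mail_queue.put(email)  # returns immediately, delivery happens later
    return "We sent a reset link, if the address exists in our database."
```

With this shape, the only extra work for a known address is a queue insert, which is far below the network noise of a normal request. A failing mail server no longer changes the response either, since delivery errors surface in the worker instead of in the handler.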
Next was opening an issue. The maintainers had a separate page describing their policy for security vulnerabilities, which mentioned a mail address to disclose issues.
Responsible disclosure
Having collected all data and information I wrote an email to the mentioned address describing my findings.
Hi Plausible Team,
[…]
When setting up the server, I misconfigured my mail configuration and found some strange behavior: I got a success message when trying to recover the password for an unknown user, but a way longer roundtrip and a failure message when doing so for the correct user.
I had a look into your code for the password_reset_request and found that you configured the mail delivery as a synchronous call before rendering the success page: https://github.com/plausible/analytics/blob/9022234aa6546a146929556b3ef3811b6d42b5a3/lib/plausible_web/controllers/auth_controller.ex#L250
This could create (together with a sufficiently big dictionary of leaked mail addresses) a potential timing attack vector to check for the addresses in your database. After fixing my environment, I ran a test against it to check for response times (both based on 5 consecutive requests):
Trying to recover for wrong mail: 73ms roundtrip on average
Trying to recover for existing mail: 1793ms roundtrip on average
Since you are returning a success in both cases, wouldn’t it be more secure to invoke the mail-delivery after rendering the success message?
Feel free to contact me at any time, if you have further questions.
KR Daniel
Of course I knew of the concept of responsible disclosure, but it didn’t come to my mind until I received the answer from Uku, one of the developers.
Hey Daniel,
You are absolutely correct. We explicitly protect against timing attacks on the login page to prevent leaking email addresses. The same precaution must be taken on the password reset page as well but it slipped my mind.
Thank you for disclosing this privately and responsibly.
[…]
Thanks again, Uku
At that point I realized that talking about this topic probably falls under some kind of gentlemen’s agreement, an unspoken NDA, until the issue is resolved.
But since Uku answered within a few hours, I was certain that a fix would be on the way soon. Nevertheless, I asked if I could write about the finding as my first blog post here and got permission to do so, as long as I waited until a fix was released.
A few days later I received a mail stating that Uku had released a fix and asking me to cross-check the new version.
As soon as I got home from work I spun up a system with the new version. At that point I was happy I had chosen a containerized solution, since it only took a few seconds to adjust my Kubernetes manifests and start it up.
Manual checks looked fine, so I ran a few automated tests.
First came my initial use case, the way I found the issue: misconfigured SMTP settings. The times looked fine, 79ms for unknown and 80.8ms for known mails on average over 5 tries each. Additionally, the success page was returned in both cases, so even with a wrong configuration a leak could no longer appear.
Next was to check the potential security issue in production environments - let’s be honest, the SMTP settings should be configured correctly on a live system.
This time I checked with 10 data points in each configuration and got average roundtrip times of 78.9ms for wrong and 83.5ms for correct mail addresses. In my opinion these numbers were similar enough.
The statistics nerd in me said that a difference of ~5% could still be statistically significant, though.
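Whether a gap like this is statistically significant depends on the spread of the individual measurements, not only on the two averages. As a sketch of how this could be checked: the relative difference follows directly from the reported averages, while Welch's t statistic needs the raw per-request timings, which are not listed in this post. The function names are my own for this example.

```python
import math
import statistics


def relative_difference(baseline: float, other: float) -> float:
    """Return how much larger `other` is than `baseline`, as a fraction."""
    return (other - baseline) / baseline


def welch_t(sample_a, sample_b) -> float:
    """Welch's t statistic for two samples with possibly unequal variances."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_b - mean_a) / standard_error


if __name__ == "__main__":
    # Averages reported above: 78.9ms for wrong and 83.5ms for correct addresses.
    print(round(relative_difference(78.9, 83.5) * 100, 1), "%")
```

For the reported averages the relative difference comes out at about 5.8%. Feeding the two lists of raw timings into `welch_t` would show how large that gap is compared to the measurement noise. With only 10 samples per group, a few slow requests can easily dominate the result.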
But, for this, I had to shut him up.
Having checked that the mail delivery was still working, I confirmed the successful fix to Uku.
Hi Uku,
great to hear this. […] In my opinion this looks good and similar enough to not get any significant difference from just a few tries per mail address. With the SMTP server configured correctly, the mails were also sent correctly (so no functional error was introduced by this fix).
Thank you for your support.
Do you need any more information from me, to close this issue?
Thanks Daniel
After exchanging some additional messages with him, I considered this issue resolved and started to plan this article.
So, what did I learn from this?
The whole experience was quite exciting, since it was the first security problem I found. My initial concerns quickly proved unfounded.
Uku acknowledged the finding right away, which resulted in a massive confidence boost for me. This just showed me again how amazing the open source community is. The main goal is to deliver cool and amazing software.
Personally, I have seen that having a bit of knowledge outside of my comfort zone can open up amazing opportunities and experiences. Even without knowing how real timing attacks are run in the field, just knowing the concept helped to identify and contextualize the issue. Having a basic understanding of source code helped to analyse and find the root cause.
Of course I updated my own Plausible server right away and will be happy to see some more traffic over the next few days. And since this is now my go-to site analytics solution, I had to update my data protection statement.
Finally I hope I can interpret this opportunity for a first post on this website as a good sign for the year(s) to come. 😃