Hi everyone,
I’m a PhD student in Computer Science researching why people choose to self-host software—what motivates you, what concerns you, and what factors affect your decision-making.
To better understand this, I’ve prepared a short anonymous survey (~10 minutes). Your insights as part of the self-hosting community would be incredibly valuable for this research.
🔗 Survey link: https://survey.lpt.feri.um.si/376953?newtest=Y&lang=en&s=ls
This study is part of my doctoral research at the University of Maribor, Slovenia, conducted under the supervision of Assist. Prof. Lili Nemec Zlatolas, PhD. All responses are anonymous and used strictly for academic purposes.
If you’ve ever self-hosted anything—or even just considered it—I’d really appreciate your input.
Thanks a lot for your time, and feel free to ask me anything about the project ([email protected])!
Cheers!
I’m a little concerned about selection bias (because obviously).
I also want to know about people who are not aware of self-hosting, and whether they’d be interested or even try it.
That’s a very valid concern, and you’re absolutely right to bring it up.
One existing study that surveyed the general population found that about 8.4% of respondents were self-hosting users, which means that in order to get enough self-hosters from the general population for meaningful analysis, we’d need a very large sample.
Unfortunately, we don’t have the funding or resources to conduct such large-scale research through a representative panel or agency. That’s why this study is focusing on communities where self-hosting is already discussed, like this one.
That said, we’re definitely aware of this limitation, and we’re also sharing the survey in broader, more general-interest online communities where we expect non-self-hosters (or people unfamiliar with the concept) to be more present. This will allow us to include comparisons between the two groups in the analysis.
Really appreciate your thoughtful comment — thanks!
One existing study that surveyed the general population found that about 8.4% of respondents were self-hosting users
Wow! That’s a lot higher than I would’ve expected. My guess would’ve been about 1%, or maybe even an order of magnitude or so less than that.
Yeah, it surprised me too! If you want to read more about it, check out the paper titled “Towards Privacy and Security in Private Clouds: A Representative Survey on the Prevalence of Private Hosting and Administrator Characteristics” by Gröber et al. (2024).
I suspect experts in something tend to define the people who do it narrowly, counting only those who do at least as much as they do.
The people who have a bunch of Docker services or complex multi-machine infrastructure are self-hosted software users, and probably in that 1-2% range. People who heard Pi-holes are useful, so they bought a Pi 3 and set one up, are self-hosted software users. Somebody using an old desktop they got on Facebook Marketplace for running Plex media is a self-hosted software user… and so on. So are the people in their houses, and some of their friends and family.
Using that inclusive definition, being closer to 10% than 1% makes sense to me.
My guess is that it also included things like the 12 year old hosting a Minecraft server for their friends. Which, to be clear, is a totally valid self-hosting use case.
I sort of fit that category. I am aware of self-hosting, even somewhat interested. But I know absolutely nothing about it, and if I’m being honest, too lazy to research it.
Truthfully, I haven’t owned my own PC/laptop in over a decade. I just use the one I get from work if I need to do something on a computer. I preferred gaming on a PS4/5 so I could just relax on the couch with a controller instead of sitting in a chair at a desk. I recently got a Steam Deck and love it. I want to poke around desktop mode some more so I can get more familiar with Linux.
I use self-hosted services in the following categories as much as possible…
That question could really use a “not applicable” option. I don’t operate any home automation solutions, so any answer from me would be invalid, and neutral answers because the item is not relevant will appear the same as neutral answers because I use both self-hosted and externally hosted solutions (e.g. Mullvad for privacy and Tailscale to get around CGNAT).
Thanks for the comment: that’s a really good point to raise.
Just to clarify: the statement “I use self-hosted services in the following categories as much as possible” is meant to reflect how fully you make use of self-hosted solutions in each area. A response like “Strongly agree” would indicate that you actively use and take full advantage of self-hosting in that category.
If you don’t use solutions in a particular category at all — whether that’s because you don’t need them, aren’t interested, or use only external services — then it’s completely appropriate to select a disagreeing option (e.g. “Disagree” or “Strongly disagree”). In this context, lower agreement simply indicates low or no use, regardless of the reason.
From a methodological standpoint, the data will be analyzed using structural equation modeling (SEM). This approach requires a complete set of responses across the measured constructs. If we included a “not applicable” option, it would create missing values in the dataset and potentially lead to excluding the entire response for that part of the analysis — which would significantly reduce the usable sample size.
That said, I really appreciate your feedback! :)
Be prepared for some respondents to choose the middle option as a proxy for “not applicable,” because that’s what I did.
Have you thought about contacting Louis Rossmann? He created an extensive video guide on how to self-host using FOSS. Perhaps he’d be willing to highlight your survey to his over 2 million subscribers.
That’s a good idea, and maybe even Henry from Techlore.
Done, though some of the questions were redundant or weirdly phrased.
Done. Nobody else wants to know why I have 3 RasPi’s running stuff around the house, so I get to tell you in the survey, lol.
Did my part! Good luck!
Thanks a lot, really appreciate it! :)
Page 2 seems to have a lot of redundant questions:
I intend to continue using self-hosting services in the future if possible.
I will use self-hosting services regularly in the future if possible.
I will frequently use self-hosting services in the future if possible.
This survey doesn’t distinguish between levels of cloud service provider, so I was a little confused.
There are virtual private servers, cloud virtual servers (like AWS), cloud-based software where you provide code or a program and the cloud system runs it on a server of its choosing, and cloud-based systems where someone else provides the software (like Google Docs).
Thank you for your feedback and for completing the survey. The first part of the survey primarily focuses on Software as a Service (SaaS). We appreciate your input and will consider ways to clarify this in future surveys.
Good Luck Luka!
I feel like I’m a minority in this group in that I really don’t like self hosting but I do it anyways because it gives me the things I want from a content/privacy/control/ownership perspective.
Thanks so much – really appreciate it! :)
PhDs are hard, so don’t get discouraged if you get told to rewrite tons of things. My dad had to rewrite many parts of his dissertation, and the arbitrary nature of the rewrites was the hardest part for him to deal with. Hopefully you have better advisors!
Good luck on the thesis, and I hope my data points can assist your research! I’m sure the community would love to see your finished thesis when it is done.
- It’s educational for those who have a lust for learning.
- It’s fun.
- It’s far more private than using commercial cloud services.
I did it for ya…good luck with your phd
Thank you so much, really appreciate it! :)
I added my answers. Good luck on your thesis!
Thanks a lot for your input and kind wishes, really appreciate it!
Done, but I felt a lot of the questions were very similar. Maybe there is a form platform that can show only a subset of control questions for every survey.
Thank you for completing the survey and for your thoughtful feedback. The similarity between some questions is intentional and follows common scientific practice when measuring complex or abstract concepts. Using multiple, slightly varied items that target the same construct increases the reliability and validity of the data by capturing subtle nuances and reducing the influence of random response variation. While your suggestion to show only a subset of such items through adaptive platforms is valid and worth exploring, fixed item sets are generally preferred in research settings to ensure consistent and robust measurement. We appreciate your input and will consider it in future survey design improvements.
To be honest, if 3-4 questions in a row had same-ish wording, I just replied the same thing 3-4 times.
Yeah, those data questions are really loaded. I don’t host for privacy or whatnot. I do it for learning: to study, experiment, and run automated stock trading algorithms. I don’t exactly have anything to hide from private companies.
Thank you for your comment. The use of similar statements is a common practice in this type of research, as it helps to better capture different aspects of a construct and ensures reliability. I understand that privacy may not be your personal motivation for self-hosting, and that’s perfectly fine. The purpose of this survey is to explore a variety of factors that can influence why individuals choose to self-host, and to determine the relative importance of each. Even if certain factors don’t apply to you, your responses contribute to a broader understanding of the motivations behind self-hosting. Thank you again for taking the time to complete the survey.