I’m looking into setting up some monitoring combined with simple automation for my selfhosting. Currently I was thinking about using Zabbix.
I want to:
Track bandwidth usage on a router/fw and on a managed switch and track cpu/ram/disk usage on my vms.
Simple monitoring (up/down/maintenance) on the router, switch, my vms as well as on linux services (jellyfin/forgejo/etc) and windows services (lab for studying work-related tools).
I’m also interested in doing simple https checks on my webuis (i’ve had a service running but the website returning both 403 and 404 before) and testing nslookup on my internal dns (if the service is up but the lookups timeout I still want to try restarting the service).
Is there any FOSS/FLOSS alternatives that I should look into before diving into Zabbix?
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don’t control.
Rules:
Be civil: we’re here to support and learn from one another. Insults won’t be tolerated. Flame wars are frowned upon.
No spam posting.
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it’s not obvious why your post topic revolves around selfhosting, please include details to make it clear.
Don’t duplicate the full text of your blog or github here. Just post the link for folks to click.
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
No trolling.
Resources:
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
I’ve been using Zabbix for ages now. It has issues but I got used to it.
Also uptime kuma for fast and easy up/down, web services, etc.
I use netdata (the FOSS agent only, not the cloud offering) on all my servers (physical, VMs…) and stream all metrics to a parent netdata instance. It works extremely well for me.
Other solutions are too cumbersome and heavy on maintenance for me. You can query netdata from prometheus/grafana [1] if you really need custom dashboards.
I guess you wouldn’t be able to install it on the router/switch but there is a SNMP collector which should be able to query bandwidth info from the network appliances.
Gonna check it out!
Is it easy to setup automatic responses to the alerts, f.e. restarting a service if it isn’t answering requests in a timely manner?
Have you used it together with Windows Servers too?
No
It should be possible using
script to execute on alarm = /your/custom/remediation-script
https://learn.netdata.cloud/docs/alerts-&-notifications/notifications/agent-dispatched-notifications/agent-notifications-reference. I have not experimented with this yet, but soon will (implementing a custom notification channel for specific alarms)I’d rather find the root cause of the downtime/malfunction instead of blindly restarting the service, just my 2 cents.