

OK mman, dont pop a vein over this

That’s incredibly rude. At no point was I angry or enraged. What you’re trying to do is minimize my criticism of your last comment by intentionally making it seem like I was unreasonably angry.

I was going to continue with you in a friendly manner, but screw you. You’re an ass (and also entirely wrong).

A lot of what you said is true.

Since the TPU is a matrix processor instead of a general purpose processor, it removes the memory access problem that slows down GPUs and CPUs and requires them to use more processing power.

Just no. Flat out no. Just so much wrong. How does the TPU process data? How does the data get there? It needs to be shuttled back and forth over the bus. Doing this for a 1080p image's worth of data several times a second is fine. An uncompressed 1080p image is about 8MB. Entirely manageable.

Edit: it’s not even 1080p, because the image would get resized to the input size. So again, 300x300x3 for the last model I could find.
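To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It assumes 8 bits per channel and no row padding. The 4-channel (RGBA) case is only my guess at where a figure of about 8MB comes from, since plain RGB works out closer to 6MB. `frame_bytes` is an illustrative helper, not part of any library.

```python
def frame_bytes(width: int, height: int, channels: int = 3, bytes_per_channel: int = 1) -> int:
    """Size in bytes of one uncompressed frame (assumes no padding between rows)."""
    return width * height * channels * bytes_per_channel


# Full 1080p frame, RGB and RGBA (RGBA is an assumption, see above).
full_hd_rgb = frame_bytes(1920, 1080)        # 6,220,800 bytes, ~6.2 MB
full_hd_rgba = frame_bytes(1920, 1080, 4)    # 8,294,400 bytes, ~8.3 MB

# What actually crosses the bus after resizing to the model's input size.
input_300 = frame_bytes(300, 300)            # 270,000 bytes
input_224 = frame_bytes(224, 224)            # 150,528 bytes

print(f"1080p RGB:  {full_hd_rgb / 1e6:.1f} MB")
print(f"1080p RGBA: {full_hd_rgba / 1e6:.1f} MB")
print(f"300x300x3:  {input_300 / 1e3:.0f} kB")
print(f"224x224x3:  {input_224 / 1e3:.0f} kB")
```

Either way the resized input is a few hundred kB at most, which is why image models are a comfortable fit for this kind of device.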


Look at this repo. You need to convert the models using the TFLite framework (TensorFlow Lite), which is designed for resource-constrained edge devices. The max resolution for input size is 224x224x3. I would imagine it can’t handle anything larger.


Now look at the official model zoo on the Google Coral website.


Not a single model is larger than 40MB. Whereas LLMs start at well over a gig for even the smaller (and inaccurate) models. The good ones start at about 4GB and I frequently run models at about 20GB. The size in parameters really makes a huge difference.
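For a sense of scale, here's a minimal sketch of how file size relates to parameter count. It assumes the file is nothing but weights at a fixed number of bits each (real files carry some metadata on top), and both helper names are made up for illustration.

```python
def max_params(size_bytes: float, bits_per_weight: int) -> float:
    """Upper bound on parameter count for a weights-only file of the given size."""
    return size_bytes * 8 / bits_per_weight


def model_size_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weights-only size in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# A 40 MB model at 8 bits per weight tops out around 40 million parameters.
coral_params = max_params(40e6, 8)

# A 7-billion-parameter LLM squeezed down to 4 bits per weight (assumed quantization).
llm_7b_q4_gb = model_size_gb(7e9, 4)

print(f"40 MB int8 model: ~{coral_params / 1e6:.0f}M parameters")
print(f"7B model at 4-bit: ~{llm_7b_q4_gb:.1f} GB")
print(f"Ratio in parameters: ~{7e9 / coral_params:.0f}x")
```

Under those assumptions, even a heavily quantized 7B model is a couple of orders of magnitude larger than the biggest models in that zoo.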

You likely/technically could run an LLM on a Coral, but you’re going to wait on the order of double-digit minutes for a basic response, if not way longer.

It’s just not going to happen.

when comparing apples to apples.

But this isn’t really easy to do, and impossible in some cases.

Historically, Nvidia has done better than AMD in gaming performance because there are just so many game-specific optimizations in the Nvidia drivers, whereas AMD’s drivers didn’t have them.

On the other hand, AMD historically had better raw performance in scientific calculation tasks (pre-deeplearning trend).

Nvidia has had a stranglehold on the AI market entirely because of their CUDA dominance. But hopefully AMD has finally bucked that trend with their new ROCm release, which is pitched as a near drop-in replacement for CUDA (in practice, CUDA code gets ported through HIP and recompiled for AMD, rather than already-compiled CUDA binaries running unchanged).

Also, AMD’s new MI300X AI processor is (supposedly) wiping the floor with Nvidia’s H100 cards. I say “supposedly” because I don’t have $50k USD to buy both cards and compare myself.

And you can add as many TPUs as you want to push it to whatever level you want

No you can’t. You’re going to be limited by the number of PCIe lanes. But putting that aside, those Coral TPUs have almost no onboard memory (only a few MB of on-chip cache). Which means for each operation you need to shuffle the relevant data over the bus to the device for processing, and then back and forth again. You’re going to be doing this thousands of times per second (likely much more) and I can tell you from personal experience that running AI like that is painfully slow (if you can get it to even work that way in the first place).

You’re talking about the equivalent of buying hundreds of dollars of groceries, and then getting everything home 10km away by walking with whatever you can put in your pockets, and then doing multiple trips.

What you’re suggesting can’t work.
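Here's a rough sketch of why the bus is the bottleneck. The assumptions are mine: the device can't hold the weights, so the whole model gets streamed over the bus once per generated token; the link is USB 3.0 at its nominal 5 Gbit/s with zero protocol overhead (a best case that real hardware won't reach); and the model is a 4GB one. Compute time is ignored completely.

```python
def transfer_seconds(payload_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to move a payload over a link at a fixed bandwidth (ignores latency)."""
    return payload_bytes / bandwidth_bytes_per_s


# Assumption: USB 3.0 nominal signalling rate, 5 Gbit/s, converted to bytes per second.
USB3_BYTES_PER_S = 5e9 / 8

MODEL_BYTES = 4e9        # a "small good" LLM, about 4 GB of weights
RESPONSE_TOKENS = 100    # assumed length of a short reply

# One full pass over the weights per token, since nothing stays on the device.
seconds_per_token = transfer_seconds(MODEL_BYTES, USB3_BYTES_PER_S)
total_minutes = seconds_per_token * RESPONSE_TOKENS / 60

print(f"Per token: {seconds_per_token:.1f} s just moving weights")
print(f"{RESPONSE_TOKENS} tokens: about {total_minutes:.1f} minutes")
```

That's the pockets-full-of-groceries problem in numbers: even with a perfect link and free compute, a short reply lands in double-digit minutes.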

ATI cards (while pretty good) are always a step behind Nvidia.

Ok, you mean AMD. They bought ATI like 20 years ago now and that branding is long dead.

And AMD cards are hardly “a step behind” Nvidia. This is only true if you buy the 24GB top card of the series. Otherwise you’ll get comparable performance from AMD at a better value.

Plus, most distros have them working out of the box.

Unless you’re running a kernel older than 6.x, every distro will support AMD cards. And even then, you could always install the proprietary blobs from AMD and get full support on any distro. The kernel version only matters if you want to use the FOSS kernel drivers for the cards.

getting a few CUDA TPUs


Those aren’t “CUDA” anything. CUDA is a parallel processing framework by Nvidia and for Nvidia’s cards.

Also, those devices are only good for inferencing smaller models for things like object detection. They aren’t good for developing AI models (in the sense of training). And they can’t run LLMs. Maybe you can run a smaller model under 4B, but those aren’t exactly great for accuracy.

The best you could hope for is to run a very small instruct model trained on very specific data (like robotic actions) that doesn’t need accuracy in the sense of “knowledge accuracy”.

And completely forget any kind of generative image stuff.

Are CUDAs something that I can select within pcpartpicker?

I’m not sure what they were trying to say, but there’s no such thing as “getting a couple of CUDA’s”.

CUDA is a framework that runs on Nvidia hardware. It’s the hardware that will have “CUDA cores”, which are large numbers of low-power processing units. AMD calls them “stream processors”.

You could also completely forego the GPU and get a couple of CUDAs for a fraction of the cost.

What is this sentence? How do you “get a couple of CUDA’s”?

I would never use anything other than ZFS. Proxmox is just Debian with a management UI. You can set up disks, volumes, etc., with the web UI. And whatever you can’t do from there you can do via the shell or ssh like you would on any other Linux system.

I know how debugging works. I’ve been a developer for a couple decades.

I know for a fact that the lines I removed are normal verbose messages and entirely unrelated to my issue. I know not only because I’m a developer and understand the messages, but also because those lines show up every second of every minute of every day. They are some of the most verbose lines in the logs. The scheduled task for the subtitles only runs once a day and finishes within a few minutes.

Also, they weren’t indicative of any code path because of how frequent they were. At such a high frequency it becomes impossible to determine which line came first in multi-threaded or asynchronous tasks.

I literally have a pinned tab for a Whisper implementation on GitHub! It’s definitely on my radar to check out. My only concern is how well it handles things like multiple speakers, and whether it generates SDH subtitles. That’s the type that has those extra bits like “[Suspenseful music]”, “[groans]”, “[screams]”, etc. All the stuff someone hard of hearing would benefit from.

Why alter the logs?

I was trying to be helpful by removing 14k irrelevant lines from a very large, and incredibly verbose, log file.
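For what it's worth, the trimming itself is nothing exotic. A minimal sketch of that kind of filter is below, assuming the noisy entries can be identified by a "[query]" marker in the line. The sample lines are made up for illustration and are not real Jellyfin output, and `drop_noise` is a hypothetical helper.

```python
from typing import Iterable, Iterator


def drop_noise(lines: Iterable[str], marker: str = "[query]") -> Iterator[str]:
    """Yield only the lines that do not contain the noise marker."""
    for line in lines:
        if marker not in line:
            yield line


# Made-up sample lines, only to show the filter working.
sample = [
    "12:00:01 [query] slow database query (12ms)",
    "12:00:02 [query] slow database query (12ms)",
    "12:00:03 [task] subtitle download started",
    "12:00:04 [query] slow database query (12ms)",
    "12:00:05 [task] subtitle download finished: 0 found",
]

kept = list(drop_noise(sample))
removed = len(sample) - len(kept)

print(f"Removed {removed} noisy lines, kept {len(kept)}")
for line in kept:
    print(line)
```

Reporting how many lines were removed, and by what rule, lets whoever reads the log know exactly what was stripped out.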

For $5, I can’t say I’d bother going back and forth with you about how to send a raw log.

This hardly was the issue or point of the post.

even though they explicitly told you why and to not edit them

LMAO. Literally nowhere in a single screenshot did anyone say “don’t edit the logs”.

I think you’re smoking up waaay too much, my dude. Either that or you definitely are the person on the other end of my email convo. I’m getting more and more convinced of it. No one else would be so driven to make me out to be the bad guy here. Each of your comments is getting downvoted because the stuff you’re saying is bonkers.

So again, either you’re growing and smoking way too much weed. Or you really are the kind worded support person that deleted my account so unceremoniously. It’s one of the two.

are now flaming someone for wanting to figure out if it was a them or you issue.

No. I’m sharing my experience with someone who sold a service that didn’t work for me, and in response to asking for a refund told me “I’m tired of talking with you” and deleted my account.

There is a checkbox to turn off logging those that you might want to consider unchecking.

Oh? I have to take a look when I have a chance. Thanks for mentioning that.

there’s a process to go through it and ignored it….

The process everywhere is:

  • request refund

that’s it. Nothing else is required. Anything else is optional.

Then when they asked for logs you just shot right to refund.

No, I provided logs, twice. Then they ghosted me for almost a month. I’m not complaining, all I did was reply again asking if they could do the refund.

You seem to be missing a hugely important point here. I didn’t want tech support, just a refund. The core tech issue did not matter. They were pushing for logs, and I went along with it. Regardless if the logs I provided were complete or not, I got told off for asking (not demanding) a refund NOT tech support.

Edit: why are you assuming that I deleted the “vast majority” of the log? Where did I mention the total size of the log?

Yes, they did give me a refund. But I wouldn’t want to do that as doing a chargeback can be incredibly messy for the vendor. I don’t want to be petty here.

I did, because I know they weren’t relevant. They were part of Jellyfin itself and not the plugin. It’s just a warning saying that a database query was slow (12ms). Since I wasn’t doing much on the server for the past few days, half the log was the warning (not an error).

So no, they weren’t part of the problem. I know they aren’t.

Edit: grammar

You stated that you are a Dev yourself, but then I was expecting that you should have tried to check their API and make the calls with curl, Postman, Insomnia or whatever, but apparently you never tried.

You’re absolutely right. I didn’t. Because I wasn’t invested in troubleshooting it. I have a full-time job, a family, etc.

The issue here is not about what wasn’t working. The issue here is being told off when simply asking for a refund.

The support person has even acknowledged that my profile was showing no downloads.

I am pretty sure they have monitoring on their API backend and can spot a problem

They do, as evidenced by the screenshot the support person shared showing the number of API calls. And they actually did have a problem with the API, which required an update to the plugin, which is all laid out at the start of my post.

OpenSubtitles Hostility
So this isn't meant to be a post bashing the devs/owner of OpenSubtitles. This is meant simply as awareness.

A few months ago I signed up for the VIP tier at OST ($5/mo for 1000 downloads a day) for a bit to populate my catalogue of videos with subtitles, as my father uses my Jellyfin server and he's lost a lot of his hearing. I also wanted to support the development a bit.

At first the service seemed to be downloading a bit, but then it stopped. I waited a few days and it would download at most one or two a day (despite a few thousand videos not having any subtitles). I looked around online and found that OST had changed their API and the Jellyfin plugin still needed to catch up with a newer release. No big deal, so I just waited.

Then the update released, which specifically stated that the changes to the API calls were made. I waited a few days, nothing. I uninstalled the OST plugin and reinstalled, still nothing. So I figured something was wrong either on my end or the server-side, but I didn't want to bother getting into it. I've been planning to rebuild my Jellyfin server with newer hardware with HW acceleration for decoding and encoding.

I sent an email to OST support explaining what I'd been seeing and asked if I could get a refund. The person who responded asked for logs so that they could help troubleshoot. So I obliged.

![Email response from OpenSubtitles support confirming there was an issue](https://lemmy.world/pictrs/image/9a222337-3bc1-4e6e-87f5-c6f42ece9f5a.png)

They said it wasn't much help and to get even more logs. Which I provided again.

![Screenshot of user CeeBee providing logs via email to OpenSubtitles support](https://lemmy.world/pictrs/image/7b91d5c9-88eb-4dbe-90a6-da205797681b.png)

I even removed over 14 thousand "[query]" lines to make the logs more readable. They said there wasn't anything there that was useful, and asked me to try again.

I indicated that Jellyfin has a scheduled job that checks for missing subtitles and pulls as needed once a day. But I said that at this point I'm just looking for the refund.

A while passed, but then I got a notification that the subscription was going to be renewed again, so I cancelled before that happened and reached out again about the refund. At this point it was more about the principle of the matter, as I originally just asked for a refund and that got side-stepped into a support request.

Then I got this as a response:

![Email response from OpenSubtitles support being aggressive and accusatory](https://lemmy.world/pictrs/image/06424767-c977-4126-a87a-1b17dfd5c3cc.png)

Which resulted in this:

![Email response from OpenSubtitles support saying "I'm tired of you" and deleted my account](https://lemmy.world/pictrs/image/8379ff70-2753-44a5-9e5f-d24d1eef7ca7.png)

I waited over two weeks to write this post. I wanted to wait and see if somebody replied back to me with even just an apology or something. If they had originally told me that doing refunds is a hassle for them I would have let it go. But telling me off and then deleting my account is just... special. I was astonished at the response and cannot fathom that being the response from any company taking payments for a service.

And I'm not holding a grudge of any kind and I get it, I used to do IT support and some days can be tough dealing with annoying emails. But in my defence all I asked for was a refund because something wasn't working.

In any case, I just wanted to bring this to the attention of the Self-hosting community so that others can make more informed decisions. To be clear, I'm not advocating anyone to pull support. In fact I think they should have more support as it's an invaluable service. Despite the treatment I still plan on getting the VIP subscription again at some point after I rebuild my Jellyfin server. But I also don't think that customers should be treated like this.

FreeNAS is a deprecated version now. The successor (which is basically the same thing) is TrueNAS. They also have a version based purely on Linux called TrueNAS Scale. Both Community and Enterprise versions are available. The Community version is entirely free. It supports VMs through KVM and containerization, as well as all the network sharing options out there.

Another option is Proxmox. It’s Debian based and is more focused on virtualization than storage, but it has whatever you would really need for storage (including full ZFS support). You might find yourself in the command line for some things with Proxmox over TrueNAS, but if you were willing to go full Ubuntu I imagine that wouldn’t be an issue.

That being said, if you want to just go the manual route, then I suggest Debian. It’s leaner and considered more stable than Ubuntu, and doesn’t have some of the cruft that Ubuntu has (like Snaps), which may be a positive or a negative depending on what your needs are.

Edit: just to add, since you’re going to run Jellyfin and Nextcloud on these systems, my recommendation is Proxmox as it has great tooling for managing VMs, like automatic backups. I personally run both Nextcloud and Jellyfin in their own VMs. I like the workflow of backing up the entire VM and being able to restore it to the exact state when it was saved. Containers require a bit more knowledge to run them to be truly stateless, and then you have to worry about backing up your stateful data (like configuration files, etc) separately.

The traffic goes through a wireguard connection. Tailscale is just a facilitator to initiate the connections. There’s more to it than that, but that’s the basic gist of it.

The core technology is wireguard, and you could set everything up yourself, but plain wireguard can be a chore and a pain to get set up. Tailscale is honestly 5 minutes to get a basic connection going.


That’s the only word you need. Ultimately, traditional VPN is outdated and almost obsolete. Wireguard is the “next iteration” of network tunneling tech. And Tailscale just makes it super simple.

The best/easiest way to get started with a self-hosted LLM is to check out this repo:


Its goal is to be the Automatic1111 of text generators, and it does a fair job at it.

A good model that’s said to rival gpt-3.5 is the new Falcon model. The full sized version is too big to run on a single GPU, but the 7b version “only” needs about 16GB.
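As a sanity check on that 16GB figure, here's a small sketch. It assumes 16-bit weights (2 bytes per parameter) and a flat 10% allowance for activations and other runtime memory. That overhead number is a guess for illustration, not a measured value, and `fits_in_vram` is a hypothetical helper. The 40-billion-parameter line is there as an assumed stand-in for a "full sized" model.

```python
def weights_gb(params: float, bytes_per_param: float = 2) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes); default is 16-bit weights."""
    return params * bytes_per_param / 1e9


def fits_in_vram(params: float, vram_gb: float, bytes_per_param: float = 2,
                 overhead_fraction: float = 0.10) -> bool:
    """True if weights plus an assumed runtime overhead fit in the given VRAM."""
    needed_gb = weights_gb(params, bytes_per_param) * (1 + overhead_fraction)
    return needed_gb <= vram_gb


small_gb = weights_gb(7e9)                 # 7B parameters at 16-bit
small_fits = fits_in_vram(7e9, vram_gb=16)
large_fits = fits_in_vram(40e9, vram_gb=16)   # assumed size of a much larger model

print(f"7B weights at 16-bit: {small_gb:.0f} GB, fits in 16 GB: {small_fits}")
print(f"40B fits in 16 GB: {large_fits}")
```

So roughly 14GB of weights plus some working room is where a number like 16GB comes from, and a model several times larger has no chance on a single card of that size without quantization or splitting it up.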


There’s also the Wizard-uncensored model that is popular.


There are a ton of models out there with new ones popping up every day. You just need to search around. The oobabooga repo has a few models linked in the readme also.

Edit: there’s also h2oGPT, which seems really promising. I’m going to try it out in the next couple of days.
