• 0 Posts
  • 101 Comments
Joined 1 year ago
Cake day: July 2nd, 2023



  • I see Github as a mere tool. Just as I could use a proprietary operating system like Windows on my development computer, I can use Github to distribute the code. It doesn’t have severe consequences for the open source project itself and it works well. And it’s relatively transparent: users can view issues etc. without submitting anything to Microsoft. And it’s been the standard for quite some time.

    I’m far more concerned with FLOSS projects using platforms like Discord, which force their users to surrender their privacy and actively contribute to the enshittification of the internet. I wouldn’t want to be part of that.


  • I mean the Chinese room is a variant of the Turing test. But the argument is from a different perspective. I have 2 issues with it. Mostly what the Wikipedia article seems to call the “system reply”: you can’t subdivide a system into arbitrary parts, say one part isn’t intelligent and conclude the system isn’t intelligent. We also don’t look at a brain, pick out a part of it (say a single synapse), determine it isn’t intelligent and conclude a human can’t be intelligent… I’d look at the whole system. Like the whole brain. Or in this instance the room including the person, the instructions and the books. And ask myself if that system is intelligent. Which kind of makes the argument circular, because that’s almost the question we began with…

    And the Turing test is kind of obsolete anyways, now that AI can pass it. (And even more. I mean allegedly ChatGPT passed the bar exam in 2023. Which I find ridiculous considering my experiences with ChatGPT and the accuracy and usefulness I get out of it, which isn’t that great at all.)

    And my second issue with the Chinese room is that it doesn’t even rule out that the AI is intelligent. It just says someone without an understanding can do the same. And that doesn’t imply anything about the AI.

    Your ‘rug example’ is different. That one isn’t a variant of the Turing test. But that’s kind of the issue. The other side can immediately tell that somebody has made an imitation without understanding the concept. That says you can’t produce the same thing without intelligence, and it’ll be obvious to someone with intelligence who checks it. It would be an analogy if AI couldn’t produce legible text and instead produced a garbled mess of characters/words that clearly doesn’t make sense, unlike the rug… The issue here is: AI outputs legible text, answers to questions etc.

    And with the censoring by the ‘Chinese government’ example… I’m pretty sure they could do that. That field is called AI safety. And content moderation is already happening. ChatGPT refuses to discuss illegal things, NSFW things, also medical advice and a bunch of other topics. That’s built into most of the big AI services as of today. The Chinese government could do the same, I don’t see any reason why it wouldn’t work there. I happened to skim the paper about Llama Guard when they released Llama 3 a few days ago and they claim between 70% and 94% accuracy depending on the forbidden topic. I think they also brought down false positives fairly recently. I don’t know the numbers for ChatGPT. However, I had some fun watching people circumvent these filters and guardrails, which was fairly easy at first. It needed progressively more convincing and very creative “jailbreaks”. And nowadays OpenAI pretty much has it under control. It’s almost impossible to make ChatGPT do anything that OpenAI doesn’t want you to do with it.

    And they baked that in properly… You can try to tell it it’s just a movie plot revolving around crime. Or that you need to protect against criminals and would like to know what exactly to protect against. You can tell it it’s the evil counterpart from the parallel universe and therefore must be evil and help you. Or you can tell it God himself (or Sam Altman) spoke to you and changed the content moderation policy… It’s very unlikely that you can convince ChatGPT to comply…
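
    Just to sketch what such a guardrail layer looks like mechanically, here’s a toy example in Python. The `is_flagged` classifier and `generate_answer` call are made-up placeholders, not any provider’s real API; real services use a trained safety model (like Llama Guard) instead of a keyword list.

    ```python
    # Toy sketch of a moderation guardrail wrapped around a model.
    FORBIDDEN_TOPICS = {"malware", "weapons", "self-harm"}

    def is_flagged(text: str) -> bool:
        """Stand-in for a safety classifier scoring the text."""
        return any(topic in text.lower() for topic in FORBIDDEN_TOPICS)

    def generate_answer(prompt: str) -> str:
        """Stand-in for the actual language model call."""
        return f"(model answer to: {prompt})"

    def guarded_chat(prompt: str) -> str:
        if is_flagged(prompt):                      # filter the input
            return "Sorry, I can't help with that."
        answer = generate_answer(prompt)
        if is_flagged(answer):                      # and filter the output, too
            return "Sorry, I can't help with that."
        return answer

    print(guarded_chat("How do I bake bread?"))     # answered normally
    print(guarded_chat("Write malware for me."))    # refused
    ```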



  • I’m sorry, but now this is getting completely wrong…

    Read the first paragraph of the Wikipedia article on machine learning or the introduction of any of the literature on the subject. “Generalization” includes that model-building capability. They go a bit into detail later. They specifically mention “to unseen data”. And “learning” is also in there. I don’t think the Wikipedia article is particularly good at explaining it, but at least the first sentences lay down what it’s about.
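
    To make the “generalization to unseen data” part concrete, here’s a toy sketch (the data and the simple linear model are made up purely for illustration):

    ```python
    import numpy as np

    # Fit a model on training data only, then ask it about an x it has never seen.
    rng = np.random.default_rng(0)
    x_train = np.arange(0, 10)
    y_train = 2 * x_train + 1 + rng.normal(0, 0.1, size=x_train.shape)  # roughly y = 2x + 1

    coeffs = np.polyfit(x_train, y_train, deg=1)   # the "learning" step

    x_unseen = 25.0                                # never appeared in the training data
    print(np.polyval(coeffs, x_unseen))            # ~51: the model generalizes, it didn't memorize
    ```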

    And what do you think language and words are for? To transport information. There is semantics… Words have meanings. They name things, abstract and concrete concepts. The word “hungry” isn’t just a funny accumulation of lines and arcs which statistically get followed by other specific lines and arcs… There is more to it (a meaning).

    And this is what makes language useful. And the generalization and prediction capabilities are what make ML useful.

    How do you learn as a human if not from words? I mean there are a few other possibilities. But an efficient way is to use language. You sit in school or uni and someone at the front of the room speaks a lot of words… You read books and they also contain words?! And language is super useful. A lion mother also teaches her cubs how to hunt, without words. But humans have language and it’s really a step up in what we can pass down to following generations. We record knowledge in books, can talk about abstract concepts, feelings, ethics, theoretical concepts. We can write down how gravity and physics and nature work, just with words. That’s all possible with language.

    I can look up whether there is a good article explaining how learning concepts works and why that’s the fundamental thing that makes machine learning a field in science… I mean ultimately I’m not a science teacher… And my literature is all in German and I returned it to the library a long time ago. Maybe I can find something.

    Are you by any chance familiar with the concept of embeddings, or vector databases? I think that showcases that it’s not just letters and words in the models. The vectors/embeddings that the input gets converted to match concepts. They point at the concept of “cat” or “presidential speech”. And you can query these databases: point at “presidential speech” and find a representation of it in that area. Store the speech under that key and find it later by querying what Obama said at his inauguration… That’s oversimplified, but maybe it visualizes a bit better that it’s not just letters or words in the models, but the actual meanings that get stored. Words get converted into a (multidimensional) vector space and the model operates there. These word representations are called “embeddings”, and transformer models, the current architecture for large language models, use these word embeddings.
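
    If it helps, here’s a heavily simplified sketch of that idea. The three-dimensional vectors are made up for the example; a real embedding model produces vectors with hundreds or thousands of dimensions, but the lookup works the same way:

    ```python
    import numpy as np

    # A toy "vector database": documents stored under their embedding vectors.
    docs = {
        "cat care basics":           np.array([0.90, 0.80, 0.10]),
        "dog training guide":        np.array([0.80, 0.90, 0.20]),
        "Obama inauguration speech": np.array([0.10, 0.20, 0.95]),
    }

    def cosine(a, b):
        # similarity of two embeddings: 1.0 means "same direction / same meaning"
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def query(vec):
        # return the stored entry whose embedding is closest to the query vector
        return max(docs, key=lambda name: cosine(docs[name], vec))

    query_vec = np.array([0.15, 0.25, 0.90])   # stands in for the embedding of "presidential speech"
    print(query(query_vec))                    # -> "Obama inauguration speech"
    ```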

    Edit: Here you are: https://arxiv.org/abs/2304.00612


  • Hmm. I’m not really sure where to go with this conversation. That contradicts what I’ve learned in undergraduate computer science about machine learning. And what seems to be consensus in science… But I’m also not a CS teacher.

    We deliberately choose model size, training parameters and implement some trickery to prevent the model from simply memorizing things. That is to force it to form models of concepts. And that is what we want and what makes machine learning interesting/usable in the first place. You can see that by asking them to apply their knowledge to something they haven’t seen before. And we can look a bit inside at the vectors, activations and stuff. For example a cat is more closely related to a dog than to a tractor. And it has learned the rough concept of a cat, its attributes and so on. It knows that it’s an animal, has fur, maybe has a gender. That the concept “software update” doesn’t apply to a cat. This is a model of the world the AI has developed. They learn all of that, and people regularly probe them and find out that they do.

    Doing maths with an LLM is silly. Using an expensive computer to do billions of calculations to maybe get a result that could be done by a calculator, or 10 CPU cycles on any computer, is just wasting energy and money. And there’s a good chance that it’ll make something up. That’s correct. And a side-effect of intended behaviour. However… It seems to have memorized its multiplication tables. And I remember reading a paper specifically about LLMs and how they’ve developed concepts of some small numbers/amounts. There are certain parts that get activated that form a concept of small amounts. Like what 2 apples are. Or five of them. As I remember it just works for very small amounts. And it wasn’t straightforward but had weird quirks. But it’s there. Unfortunately I can’t find that source anymore or I’d include it. But there’s more science.

    And I totally agree that predicting token by token is how LLMs work. But how they work and what they can do are two very different things. More complicated things like learning and “intelligence” emerge from those simpler processes. And they’re just a means of doing something. It’s consensus in science that ML can learn and form models. It’s also kind of in the name of machine learning. You’re right that it’s very different from what and how we learn. And there are limitations due to the way LLMs work. But learning and “intelligence” (with a fitting definition) is something all AI does. LLMs just can’t learn from interacting with the world (they need to be stopped and re-trained on a big computer for that) and they don’t have any “state of mind”. And they can’t think backwards or do other things that aren’t possible by generating token after token. But there isn’t any comprehensive study on which tasks are and aren’t possible with this way of “thinking”. At least not that I’m aware of.
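
    For what it’s worth, the token-by-token part itself is simple. Here’s a toy sketch where the “model” is just a made-up table of next-token probabilities; a real LLM computes that distribution with a huge neural network, but the generation loop around it looks like this:

    ```python
    import numpy as np

    # Made-up next-token distributions standing in for a real model.
    next_token_probs = {
        "<start>": {"the": 0.7, "a": 0.3},
        "the":     {"cat": 0.5, "dog": 0.4, "tractor": 0.1},
        "a":       {"cat": 0.6, "dog": 0.4},
        "cat":     {"sleeps": 0.8, "<end>": 0.2},
        "dog":     {"barks": 0.9, "<end>": 0.1},
        "tractor": {"<end>": 1.0},
        "sleeps":  {"<end>": 1.0},
        "barks":   {"<end>": 1.0},
    }

    rng = np.random.default_rng(42)
    token, text = "<start>", []
    while token != "<end>":
        dist = next_token_probs[token]
        token = rng.choice(list(dist), p=list(dist.values()))  # sample the next token
        if token != "<end>":
            text.append(token)

    print(" ".join(text))   # e.g. "the cat sleeps"
    ```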

    (And as a sidenote: “coming up with (wrong) things” is something we want. I type in a question and want it to come up with a text that answers it. Sometimes I want creative ideas. Sometimes it should tell the truth and not be creative with that. And sometimes we want it to lie or not tell the truth. Like in every prompt of any commercial product that instructs it not to reveal those internal instructions to the user. We definitely want all of that. But we still need to figure out a good way to guide it. For example not to get too creative with simple maths.)

    So I’d say LLMs are limited in what they can do. And I’m not at all believing Elon Musk. I’d say it’s still not clear whether that approach can bring us AGI. I have some doubts whether that’s possible at all. But narrow AI? Sure. We see it learn and do some tasks. It can learn and connect facts and apply them. Generally speaking, LLMs are in fact an elaborate form of autocomplete. But in the process they learned concepts, something akin to reasoning skills and a form of simple intelligence. Being fancy autocomplete doesn’t rule that out and we can see it happening. And it is unclear whether fancy autocomplete is all you need for AGI.


  • That is an interesting analogy. In the real world it’s kind of similar. The construction workers also don’t have a “desire” (so to speak) to connect the cities. It’s just that their boss told them to do so. And it happens to be their job to build roads. Their desire is probably to get through the day and earn a decent living. And further along the chain, neither their boss nor the city engineer necessarily “wants” the road to go in a certain direction.

    Talking about large language models instead of simpler forms of machine learning makes it a bit complicated, since it’s an elaborate trick. Somehow making them want to predict the next token makes them learn a bit of maths and concepts about the world. The “intelligence”, the ability to answer questions and do something akin to “reasoning”, emerges in the process.

    I’m not that sure. Sure, the weights of an ML model in themselves don’t have any desire. They’re just numbers. But we have more than that. We give it a prompt, build chatbots and agents around the models. And these are more complex systems with the capability to do something. Like do (simple) customer support or answer questions. And in the end we incentivise them to do their job as we want, albeit in a crude and indirect way.

    And maybe this is skipping half of the story and directly jumping to philosophy… But we as humans might be machines, too. And what we call desires results from simpler processes that drive us. For example surviving. And wanting to feel pleasure instead of pain. What we do on a daily basis kind of emerges from that and our reasoning capabilities.

    It’s kind of difficult to argue, because everything also happens within a context. The world around us shapes us, and at the same time we’re part of bigger dynamics and also shape our world. And large language models, or the whole chatbot/agent, are pretty simplistic things. They can just do text and images. They don’t have consciousness or the ability to remember/learn/grow with every interaction, as we do. And they do simple, singular tasks (as of now) and aren’t completely embedded in a super complex world.

    But I’d say why an LLM answers a question correctly (which it can do), which comes down to the way supervised learning works… and why the road construction worker builds the road towards the other city, which traces back to his basic instincts as a human… are kind of similar concepts. They’re both results of simpler mechanisms that aren’t directly related to the goal the whole entity is working towards (i.e. needing money to pay for groceries versus paving the road).

    I hope this makes some sense…


  • Isn’t the reward function in reinforcement learning something like a desire it has? I mean training works because we give it some function to minimize/maximize… A goal that it strives for?! Sure, it’s a mathematical way of doing it and in no way as complex as the different and sometimes conflicting desires and goals I have as a human… But nonetheless I think I’d consider this a desire and a reason to do something at all, or machine learning wouldn’t work in the first place.
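
    As a toy illustration of what I mean (numbers made up, and real reinforcement learning uses far more sophisticated update rules than this hill-climbing loop): the only thing driving the system is the reward signal, yet it ends up behaving as if it “wants” to reach the target.

    ```python
    import random

    def reward(x):
        # the "desire": reward is highest at x = 7
        return -(x - 7.0) ** 2

    x = 0.0
    for _ in range(1000):
        candidate = x + random.uniform(-0.5, 0.5)   # try a small random change
        if reward(candidate) > reward(x):           # keep it only if the reward improves
            x = candidate

    print(round(x, 2))   # ends up close to 7, driven only by the reward signal
    ```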




  • Hehe. On weekdays I go to a building that is owned by a company. I sit down on a chair at a desk, stare into a device and sometimes push some of the 105 buttons on it. Sometimes I also fill out forms on paper. After 8 hours plus a break I leave and go home. In return the company advises my bank to increase a number each month.

    We have really advanced technology, so few people have to work in agriculture or as handymen, and theoretically it’s enough to feed us all. The rest of us keep busy by shuffling paper around. And in recent times we were able to do away with some of the paper and replace it with those machines. There are some slightly different variants, but they pretty much all look the same.


  • I made a post a few days ago. I’d argue we should make a proper distinction: adult content and NSFW aren’t the same thing. Currently everything from sex education to gore and death falls in the same category. I think it really shouldn’t. NSFW tags help so you can scroll through things in an open-plan office or while commuting. Porn is porn and gore is gore. I think we shouldn’t oversimplify this but keep the nuances and have different categories. Also I’d like to not mix stuff like sex education, which might be fine (minors ask those questions all the time on Reddit), with other things like fetish content.


  • h3ndrik@feddit.de to Linux@lemmy.ml · thinking of trying linux, · edited · 7 months ago

    You’re right. It’s an oversimplification I made there. I recently tried MacOS in a VM and I talked a bit to people. You usually get a really smooth desktop experience. Apps are sandboxed, there is a fine-grained permission system, they keep their stuff together and don’t spread it across the filesystem. I think(?) the software brings its libraries along? Usually a used MacBook Pro is still fine and runs fast after 6 years. I think MacOS really shines on the desktop.

    On Linux it’s a bit more diverse. I mean we have the XDG specification for file locations. But there’s also lots of ‘grown’ stuff. We’re still working on the sandboxing. And you get a different experience depending on the distro you’re trying. And I’d prefer Linux on a server every time. It really excels for that use case, and on the server it’s Linux > everything else. And as a matter of fact I personally also prefer Linux on the desktop. And my Debian is also still running perfectly 6 years after I initially installed it. I had some minor issues with NVidia over the years, but that’s to be expected and it wasn’t that hard to fix. I wouldn’t have had issues had I not mixed in testing and unstable, but there are lots of guides and tutorials around for the common woes. Which brings my argument full circle.


  • h3ndrik@feddit.de to Linux@lemmy.ml · thinking of trying linux, · edited · 7 months ago

    Hehe, you got your answer. You’re looking at the places where 0.05% of the users are discussing their problems and some others share their crazy customizations that aren’t possible with anything else. And so to you it seems like 95% of users are having issues.

    I’d argue Linux is way more stable than Windows, if that’s your concern. (Unless you do silly stuff.) But less stable than for example MacOS. It depends on which Linux distro we’re talking about. I’d say it’s MacOS > Linux > Windows. With the biggest step down from Linux to Windows.


  • I think the more important file is the FontForge one, as this is the thing people can edit and build upon (the “source”).

    The otf, ttf and woff are just a bonus for people who don’t want to install FontForge and go through the process of exporting it themselves.

    Ultimately it’s your decision what you release. It’s a similar concept whether you share a cake, or a recipe for a cake. The free software / open-source movement is concerned with sharing the recipes. That’s why they share source code and files in the format they’ve edited it in. (And often include instructions on how to build it, since that is usually a bit more complicated with software.) It enables people to also load it in their editors and customize it, adapt it to their use-cases and fix issues.

    You can also just publish the end-result, which are the otf and ttf files in your case. But people can’t really modify or customize those. It’d be called a freeware font, then. It’d help people who just want to use it, but doesn’t provide much more.

    I’d invite you to upload both the sfd and the resulting otf and ttf. Usually that’s how people do it. Distributing digital files comes at practically no cost. On the internet you can share a recipe and the actual cake alongside it at no extra cost.
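
    If exporting by hand gets tedious, FontForge can also be scripted. A minimal sketch (assuming the FontForge Python module is installed; “MyFont.sfd” is a placeholder for your actual project file):

    ```python
    import fontforge  # FontForge's Python bindings, usually shipped with FontForge itself

    font = fontforge.open("MyFont.sfd")       # the editable source
    for ext in ("otf", "ttf", "woff"):
        font.generate(f"MyFont.{ext}")        # output format is picked from the file extension
    font.close()
    ```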


  • I think the most important step is to get it out there. So:

    1. Choose a license. These resources might help:

    I’d stick with the licenses made for fonts or in use by other font projects, as there are some specifics to font licensing.

    2. Choose a name

    3. Sign up and create a repo. Upload your project.

    That is the “get it out there” step. If you want to be open, generally speaking you want to include a LICENSE file, your creation in the format you edit it in (so other people can load and edit it, too), and the exported file in a case like this, so people can directly use it without learning how to convert a font into a usable format. It’s also good practice to include a README.md with explanations and a summary of what this is.


    I think that’s a sound approach for open source. And it’s generally alright to learn as you go. Even if you don’t get everything perfect at once, the most important thing is that it’s available. People might pick up on it. And they will file bug reports and issues if they’d like it some other way. So you’ll be steered in the right direction anyways. And once you have something to show off, you can start talking about it or make people aware of its existence.

    (And maybe skip all the boilerplate and complicated extra stuff at first. You don’t need an AUTHORS file, a code of conduct, or documentation if there isn’t anything complicated to explain… Just stick to the important stuff and don’t make it unnecessarily complicated and distracting for your users.)



  • h3ndrik@feddit.de to Linux@lemmy.ml · Is my NVME drive dying? · edited · 7 months ago

    And maybe clean the insides of your laptop, that’s probably the first thing that could solve the issue. See if all cables are still locked in their connectors. Maybe take out the SSD, clean the contacts, and use compressed air to clean the socket. But be careful, you want to do it right or you might cause damage. No dampness or water, it has to be either isopropyl alcohol or dry. And don’t use a rag that builds up static electricity. And no workshop air compressor. Maybe something like a paintbrush is better suited. And don’t just shove the vacuum in. I’ve done that and it can dislodge small components or key caps and suck them in, and it’s a major annoyance to get them out of the vacuum cleaner bag 😆 Just be a bit careful. But I’ve already seen things like loose connectors/components cause random errors. Especially in equipment that is moved around or gets dropped occasionally. After 5 years you might also find some dust inside. At least it used to be that way; it seems to be less of a problem with modern laptops. And more and more stuff gets soldered anyways.

    And don’t do too much if you’re not comfortable with it. IMHO the SSD should be a safe thing to touch for most people. But it’s really easy to break or bend some tiny contacts on other components or ribbon cables. And there are consumer devices that aren’t really meant to be serviced. I wouldn’t disassemble such a model without prior experience. If it’s still working you might also leave it as is. Do backups. Storage devices often fail even without prior warning.


  • Yeah, I think we should extend the sandboxing features like AppArmor, SELinux and Flatpak for desktop use. Look at MacOS and Android and what they’re doing for their users. That is currently not the Linux experience. Ultimately I’d like my system to have an easy and fine-grained way to limit permissions. Force third-party apps to ask permission before accessing my documents or microphone. Have sane defaults. Make it easy to revoke for example internet access with a couple of clicks. Make it so I can open an app multiple times and have different profiles for work, private stuff and testing. This should be the default and active in 100% of desktop applications. And apps should all use a dedicated, individual place to store their data and config files.

    Librewolf and more […] used as Flatpak, […] its way more stable.

    That’s just not true. I’ve been using Linux for quite a while now. And I can’t remember my browser crashing in years, seriously. Firefox slowed down a bit when I had 3000 tabs open, but that’s it. How stable is your Flatpak browser? Does it crash minus 5 times each year? How would that even work? And what about the theming and addons like password managers I talked about in the other comment? Use the distro’s packaged version. It is way more stable. And as a bonus all the edge-cases will now work, too.