AI Safety – Zraix

Ok, so I lost my job in February, which sucked. But I’m trying to make the best of it – I’ve been interested in AI Safety for a long time and this might be the best opportunity I’ll get to try and get into it.

So far, I’ve

Attended BlueDot’s intensive Technical AI Safety course
- It was a great kick-off for this endeavor really – it forced me to spend many hours reading up on the field, engaging with the arguments, and starting to write down my own ideas and thoughts on it. It was very intense, and I’m not used to reading that much really technical text in such a short time, but I think it was the kick in the butt I needed to get a running start. I highly recommend it to anyone who wants to get into this and start off with a broad overview of the technical side of the issue.
Started doing the ARENA program on mechanistic interpretability
- As I suspected, since I’m going through it without any social pressure, it’s taking its sweet time ofc, but it’s surprisingly fun! It’s very difficult, I haven’t really learned new math in a very long time, and considering how exhausted I am after some of this, I think I might not have really learned anything at all in a long time x) …
Trying to do a replication of the Toy Models of Superposition paper
- I have also never really delved this deep into a technical research paper before, but it’s going pretty smoothly, especially with the help of Claude. It’s actually a great feeling when I re-read a part that sounded like gibberish just a minute ago and now I understand it. You can really get hooked on the *ping* feeling of the penny dropping. ^^
Playing with AI models
- So a bit more on the fun side, ofc. But you can’t just think about how it’s gonna destroy the world and how capable it is without also wanting to play with it a bit :P. Ofc, I also don’t think it’s the current models that are going to lead to our extinction, so it’s ok xP
- AI models playing Baldur’s Gate 3 (or any point-and-click game really) on a Steam Deck.
Starting a Substack
- Where I can collect all of the writing about AI that I’m, hopefully, going to see through. I have some thoughts on a few articles already, but I’m also a bit nervous since the last time I wrote articles was back when I was a politician, and I don’t remember that as being all that pleasant. I also haven’t grown to like online debates more since then, so I’ll have to deal with that somehow. I’ll try to auto-post the articles from there to here as well – but I don’t imagine there are any followers here that would be sad if I didn’t haha x)
Joined PauseAI Germany
- Went to two events so far, and those have been fun! I don’t know how effective the messaging is, but it’s at least a great way to meet and socialize with people who are already on board with the threat, as some of my friends are still very much not there unfortunately.
- Also attending regular co-working outside the Foresight offices ^^, not entirely sure for what purpose, but I guess networking, which isn’t really my strong suit. I’ll have to keep at it though, I guess, since I’m trying to land a job in the space.