Diffstat (limited to 'content/blog')
-rw-r--r-- content/blog/_index.md | 3
-rw-r--r-- content/blog/anthropic.md | 229
-rw-r--r-- content/blog/bsky.md | 100
-rw-r--r-- content/blog/federations.md | 85
-rw-r--r-- content/blog/thoughtleaders.md | 56
5 files changed, 473 insertions, 0 deletions
diff --git a/content/blog/_index.md b/content/blog/_index.md
new file mode 100644
index 0000000..0a8829f
--- /dev/null
+++ b/content/blog/_index.md
@@ -0,0 +1,3 @@
+---
+title: blog
+---
diff --git a/content/blog/anthropic.md b/content/blog/anthropic.md
new file mode 100644
index 0000000..0fc8a24
--- /dev/null
+++ b/content/blog/anthropic.md
@@ -0,0 +1,229 @@
+---
+title: Amodei's goons are cowards
+date: 2025-06-29
+showdate: true
+bookToC: false
+tags: [rant, legal]
+---
+
+I received a notification from my VPS provider that my VPS was running at more than 90% CPU utilization, and when I checked the `nginx` logs I saw the following. I was effectively being DDOS'd by ClaudeBot, the crawler run by Dario and his troupe of researchers[^1].
+<!--more-->
+---
+```
+# ... more before
+18.221.167.11 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.144.89.42 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.22.70.169 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.117.154.134 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.117.172.189 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.223.195.127 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.147.48.105 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+13.58.61.197 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.21.46.68 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.15.149.24 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.224.54.61 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.118.32.7 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.145.7.187 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.133.137.10 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.147.86.143 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+18.118.144.109 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+3.135.206.25 ... (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
+# ... more after
+```
+---
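A minimal sketch of how log lines like the ones above could be tallied per crawler. It assumes the nginx "combined" log format, where the user agent is the last quoted field. The sample lines, including the addresses from the documentation range `192.0.2.0/24`, are invented for illustration and are not taken from the real log.

```python
import re
from collections import Counter

# Last double-quoted field on the line (the user agent in nginx "combined" format).
UA_FIELD = re.compile(r'"([^"]*)"\s*$')
# Bot name advertised as "compatible; Name/version" inside the user agent.
BOT_TOKEN = re.compile(r"compatible;\s*([A-Za-z0-9_-]+)/[\d.]+")

# Hypothetical sample lines, not real traffic.
SAMPLE_LINES = [
    '192.0.2.1 - - [24/Nov/2024:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '192.0.2.2 - - [24/Nov/2024:10:00:01 +0000] "GET /b HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '192.0.2.3 - - [24/Nov/2024:10:00:02 +0000] "GET /c HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"',
]


def count_bots(lines):
    """Count requests per advertised bot name; everything else lands in 'other'."""
    counts = Counter()
    for line in lines:
        ua_match = UA_FIELD.search(line)
        if not ua_match:
            continue  # line does not end in a quoted field, skip it
        bot = BOT_TOKEN.search(ua_match.group(1))
        counts[bot.group(1) if bot else "other"] += 1
    return counts


counts = count_bots(SAMPLE_LINES)
```

A tally like this only reflects what a client chooses to put in its user agent string, so it undercounts crawlers that hide behind a browser-like string.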
+Because I host my own git server, I am in essence at the mercy of the crawlers. In any case this, followed by another DDOS attack by Dario's team some time later, finally gave me enough determination to use [`ai.robots.txt`](https://github.com/ai-robots-txt/ai.robots.txt), which sets up an automatic blocker for known AI crawlers.
+
+At the beginning I only used it to generate a `robots.txt`, without the actual blockers on the reverse proxy level, hoping that the different crawlers would respect it.
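As a rough illustration of what a `robots.txt` rule does for a crawler that honours it, here is a sketch using Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT` and the `example.org` URL are assumptions made for the example. They are not the file that `ai.robots.txt` generates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a well-behaved crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


claudebot_ok = is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.org/some/repo")
other_ok = is_allowed(ROBOTS_TXT, "SomeOtherAgent", "https://example.org/some/repo")
```

The check is purely advisory. Nothing here stops a crawler that ignores the file, which is why a blocker on the reverse proxy is the stronger measure.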
+
+The more pressing matter however was the fact that I had used a custom license. One that is not `free and open source`, but rather, my own, anti-crawler license.
+
+To that extent I tried to reach out to Dario's legal team, to notify them of my license, so that they wouldn't use my data to train their model (license details will be coming up).
+
+Since I am just "a guy", I decided that the best option was to reach out via LinkedIn to a member of their legal team. I found another "a guy", and I had the following conversation
+
+---
+>Nov 25, 2024
+>sent the following message at 5:40 PM
+>
+>Good afternoon,
+>
+>Anthropic's ClaudeBot recently crawled my web server and scraped my public git repositories hosted on my web server (I can send the server traffic logs if you wish).
+>
+>My code is licensed under a custom license, the DBEL, which has non-commercial provisions about the software (as defined in the license), especially for use in AI training.
+>
+>Specifically, clauses 1-4 read:
+>
+>Don’t Be Evil License (DBEL) 1.0
+>
+>1. Acceptance
+>
+>By using, copying, modifying, or distributing the source code, training data, training environment, or its associated machine learning model weights (collectively the “Software”), you agree to comply with all terms outlined in this license.
+>
+>2. Copyright License
+>
+>The Licensor (defined below) grants you a non-exclusive, worldwide, royalty-free, non-sublicensable, non-transferable license to use, copy, modify, and distribute the Software, including associated model weights, training data, and training environments, subject to the conditions set forth in this license.
+>This includes the right to create and distribute derivative works of the Software, provided that the limitations below are observed.
+>
+>3. Non-Commercial Use Only
+>
+>You may use, copy, modify, and distribute the Software and derivative works solely for non-commercial purposes.
+>Non-commercial purposes include, but are not limited to:
+>
+>Personal research and study.
+>Educational and academic projects.
+>Public knowledge and hobby projects
+>Religious observance.
+>Non-commercial research, or AI and machine learning (ML) experimentation.
+>
+>4. Distribution and Monetization Provisions
+>
+>Any use of the Software or derivative works for profit, or in a business context, including in monetized services and products, requries explicit, seperate permission from the Licensor.
+>The restrictions on commercial use apply to both the source code and any model weights produced by the Software.
+>
+>Any distribution must include this license, and the non-commercial restriction must be maintained. Weights resulting from use of the Software, including but not limited to training or fine-tuning models, must be shared under this same license, ensuring all restrictions and conditions are preserved.
+>[...]
+>
+>And in clause 12 you can find the definitions:
+>“Licensor”: The entity or individual offering the Licensed Materials under this license.
+>
+>“Licensed Materials”: The software, source code, training data, training environment, model weights, and any associated AI/ML components provided under this license.
+>
+>“You”: The individual or entity accepting the terms of this license, including any organization or entity that this individual or entity might work for or represent, including any entities under common control.
+>
+>“Your license”: The license granted to you for the software under this terms.
+>
+>“Model weights”: The machine learning model parameters generated by training or fine-tuning models using the Licensed Materials.
+>
+>“Use”: Anything you do with the software requiring your license
+>
+>As Anthropic is a for profit company, that crawls the web with the intent of scraping data to train their commercial models on, this is a violation of the DBEL.
+>
+>Furthermore the resulting model from the training must also have the DBEL, as per clause 4, which gives me legal rights on how any LLM generated by my data is used, distributed, or monetized.
+>
+>Since there is no easily findable way to communicate with the Anthropic legal team, I am reaching out to you to notify them to either delete all DBEL licensed code, or to reach out to discuss with my lawyer how to best move forward.
+>
+>Thank you very much,
+
+>>sent the following message at 10:43 PM
+>>Good afternoon, I would appreciate it if you would send an email to legal@anthropic.com. I’m on vacation this week, and this is the fastest path to the correct people at Anthropic.
+
+>Nov 26, 2024
+>sent the following messages at 1:51 AM
+>Thank you very much, have a good rest of the vacations
+
+---
+Things went quiet after applying `robots.txt`, at least for a little while; Dario's crew does seem to respect it, and being myself, I decided to let the whole thing pass, at least for the time being. In truth, my license is about commercial use of models trained on my data, and I didn't have any proof that my data wasn't filtered out in between the crawling and the training.
+
+So I set up a honeypot repo, which is full of a specific phrase that is unique, so that if any LLMs do scrape my git repo and they do train on my data, I could easily prove that they did it.[^2]
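The honeypot idea can be sketched as follows, assuming the unique phrase is a random token. The function names and the `canary-` prefix are hypothetical, and the phrase actually used in the repo is not shown here.

```python
import secrets


def make_canary(prefix: str = "canary") -> str:
    """Build a phrase that is vanishingly unlikely to occur anywhere else."""
    return f"{prefix}-{secrets.token_hex(16)}"


def contains_canary(text: str, canary: str) -> bool:
    """Case-insensitive check for the phrase in, say, a model's output."""
    return canary.lower() in text.lower()


canary = make_canary()
# Hypothetical file content for the honeypot repo: the phrase repeated many times.
honeypot_body = "\n".join(f"# {canary}" for _ in range(100))
```

Finding the phrase in a model's output would be strong evidence that the repo was in the training data. Not finding it proves nothing either way.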
+
+After a couple of months, where I hadn't updated my setup, I got DDOS'd again, this time by the combined might of Zuckerberg and his ilk, as well as Bezos' minions, both sending bots with instructions to completely disregard the `robots.txt`, and hammer at the poor VPS I have set up.
+
+I checked Zuckerberg's and Jassy's legal, as I naively assumed that they would have an email much like Dario, but to my surprise, I realised that they maliciously expect you to send physical mail to an address in Ireland[^3] (that is Zuckerberg's; I couldn't find any contact info for Jassy's folk, probably because I didn't look hard enough).
+
+In any case. Yann LeCun tends to make the models that come out of his department weight-available (not to be confused with open-weight or open-source) which I am ok with, at least to the extent that if they do train their model on my data, at least I get to run the model for myself.
+
+Concerning the humans that work for Andy Jassy, I never heard of them, nor did I know that they do AI research/models, but it made sense; Alexa is bad, they want to improve it, so they want more data to go from a symbolic AI model to an LLM. I had also heard that they wanted to use Amodei's crew to contribute to the research, so I decided to be more thoughtful; I would send an email to Dario's legal team.
+
+---
+Topic: Possible violation of software license
+
+>Hello,
+>
+>I noticed that AmazonBot crawled my git server at https://erga.apotheke.earth a couple of days ago. I’m reaching out because I understand that Amazon provides training data to your systems, and I want to give you a heads up regarding my projects.
+>
+>My repositories are licensed under two custom licenses—the Don't Be Evil License 1.0 (DBEL 1.0) and the Don't Be Evil License 1.1 (DBEL 1.1). Both licenses include explicit provisions restricting the use of the repository contents for training data or any commercial application without meeting the stated terms.
+>
+>For example, in DBEL 1.0:
+>
+> • Section 3 ("Non-Commercial Use Only") restricts use to non-commercial purposes.
+>
+> • Section 4 ("Distribution and Monetization Provisions") specifically states that:
+>
+> Any use of the Software or derivative works for profit, or in a business context, including in monetized services and products, requires explicit, separate permission from the Licensor. The restrictions on commercial use apply to both the source code and any model weights produced by the Software.
+>
+> Weights resulting from use of the Software, including but not limited to training or fine-tuning models, must be shared under this same license, ensuring all restrictions and conditions are preserved.
+>
+>Similarly, DBEL 1.1 includes analogous requirements, particularly in its provisions regarding commercial use and the need for fair compensation if your systems use the Software (or derivatives thereof) in a commercial context.
+>
+>You can review the full licenses here:
+>
+> • DBEL 1.0: https://erga.apotheke.earth/aethrvmn/protest/src/branch/master/LICENSE
+>
+> • DBEL 1.1: https://erga.apotheke.earth/aethrvmn/alectors/src/branch/master/LICENSE [^4]
+
+>Given these terms, I kindly request that you review your data collection practices to ensure compliance with my licenses. If my work is being used for training or any commercial applications without the appropriate permissions or compensation, I would like to discuss how we can address this situation.
+>
+>Thank you for your attention to this matter.
+>Sincerely,
+
+
+>>Dear Mr.
+>>Thank you for your April 3, 2025 email. We understand that the activity at issue relates to AmazonBot. We do not control the AmazonBot crawler or receive crawl data from Amazon for training. Accordingly, we will consider this inquiry resolved.
+>>Sincerely,
+
+>Thank you for the response.
+>I also have records of ClaudeBot DDOSing my server on the 24th and 28th of November. Back then only the DBEL 1.0 was used, so if Anthropic has trained its models without filtering said data/code the DBEL 1.0 applies.
+>Yours,
+
+>>Thank you for letting us know about traffic related to Anthropic’s web crawling. Anthropic aims to limit the impact of our crawling on website operators. We respect industry standard robots.txt instructions, including any disallows for the CCBot User-Agent (we use ClaudeBot as our UAT. Documentation is available at https://anthropic.com/crawl). Our crawler also respects anti-circumvention technologies and does not attempt to bypass CAPTCHAs or logins. Please note also that claudebot@anthropic.com is the correct point of contact for this bot going forward, as is documented in every request as part of the user agent string.
+>>I understand that you've disallowed our web-crawler ClaudeBot from accessing https://erga.apotheke.earth, so we should not be crawling it or training on data from it. Can you share more information about the activity you're seeing? If you have logs, I can share that with our team to investigate further.
+>>Best regards,
+
+>Good morning,
+>
+>These messages have not been about informing you about ClaudeBot, they have been about your possible violations of my copyright license.
+>
+>If you have used DBEL 1.0 licensed code to train any ML model, (which is possible because of the indication of ClaudeBot in my logs), then I have commercial rights to the resulting ML model, and it must also carry the DBEL 1.0.
+>
+>4. Distribution and Monetization Provisions
+>Any use of the Software or derivative works for profit, or in a business context, including in monetized services and products, requries explicit, seperate permission from the Licensor. The restrictions on commercial use apply to both the source code and any model weights produced by the Software. Any distribution must include this license, and the non-commercial restriction must be maintained. Weights resulting from use of the Software, including but not limited to training or fine-tuning models, must be shared under this same license, ensuring all restrictions and conditions are preserved.
+
+>This would mean that Anthropic would need permission from me personally in order to use any of their LLMs, that have been trained on my code, in a commercial setting.
+>Therefore I have been sending these emails to inform you that in case you do train a model using my code, you have accepted this License and therefore agree to come to contract with me concerning the commercial use of the model, as per Clause 1.
+>Once again, you can find DBEL 1.0 here:
+>https://erga.apotheke.earth/aethrvmn/protest/src/branch/master/LICENSE
+
+
+>>Thank you. We disagree with your interpretation of this hypothetical scenario. In any event, I can confirm that your site is not in our training corpus so we will consider this matter closed and will not be responding to further communications. Anthropic reserves all rights.
+
+
+>Good afternoon,
+>
+>My "hypothetical scenario" is based solely on the information provided by Anthropic at <https://support.anthropic.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler>
+>
+>"
+>ClaudeBot
+>ClaudeBot helps enhance the utility and safety of our generative AI models by collecting web content that could potentially contribute to their training.
+>"
+>
+>Given the fact that ClaudeBot appears in my server logs, it would then make sense to be alarmed at the idea of Anthropic using my data for training purposes.
+>
+>
+>Since you only disagree with my interpretation of the following facts:
+>
+>that Anthropic is using ClaudeBot to crawl websites for information to be used in training,
+>
+>that Anthropic crawled my server with ClaudeBot,
+>
+>that my software was under a restrictive license,
+>
+>and that said license, which you accept upon using the Software as per Clause 1, has special clauses regarding the use of the Software as training data.
+>
+>
+>Would you be willing to point out the error in my interpretation that the license should hold?
+>
+>Obviously crawling a server to collect training data and actually using the data for training are two seperate processes, however there is no way for me to know whether my data has been filtered out or not by your internal processes, hence this discussion.
+>
+>Would you be willing to sign a declaration that Anthropic has not used the data, that they have acquired from my server, for training purposes?
+>
+>Yours,
+>Vasileios Valatsos.
+---
+at which point the response was an 'Out Of Office' template reply.
+
+Cowards.
+
+[^1]: I do not recognize metaphysical entities such as "corporations". They are human-made constructs in order to hide blame. Equivalent to golems in Hebrew mysticism, or the Kapparot. The humans that control them are the actors and the offenders.
+[^2]: Note that the reason for such an aggressive license is not to make me money. I just want them to leave me alone, hence why I informed them.
+[^3]: I am a citizen of an EU state, and the European HQ for most US megacorps are in Ireland.
+[^4]: This has since been moved to GPLv3+ and a new repo set up.
diff --git a/content/blog/bsky.md b/content/blog/bsky.md
new file mode 100644
index 0000000..245ec98
--- /dev/null
+++ b/content/blog/bsky.md
@@ -0,0 +1,100 @@
+---
+date: 2024-11-15
+title: bsky.social & data non-privacy
+showdate: true
+tags: ['privacy', 'media']
+---
+Alright, that was a cool two days of self-hosting an atproto server and having a bsky account. [Back to mastodon](https://sigmoid.social/@aethrvmn). On the way back, i want to open a discussion
+<!--more-->
+
+## how atproto works
+
+atproto is the protocol on top of which apps like bsky and whtwnd are built. it offers the interesting idea of an underlying network on which apps are built like instances, each showing the particular combination of data that it was built for.
+
+This works because each user has a single unique identifier across the network, which resides alongside all their other data inside the Personal Data Server (PDS). Since all of a user's data live inside of the PDS, making an account in one of the platforms that reside on top of atproto means that you effectively have an account on the entirety of the atproto network, so bluesky, whitewind, an instagram clone. One account, every application[^1].
+
+atproto is built to be federated, so that every account from every PDS is able to interact with every other account from every other PDS, some of which (like the one I made for two days) are self-hosted.
+
+Other federated projects, like ActivityPub, ensure this by making the API endpoints of different services identical, so for example a video upload on PeerTube or an image post on PixelFed are considered to be identical to a post on Mastodon, and can be viewed and interacted with by people on Mastodon. You still need separate accounts for each platform in order to post, but you can interact with posts AND users from everywhere.
+
+## the problem with privacy
+
+atproto achieves this interoperability by making every data point of every user completely public. This means that every post, every interaction, every follow is completely public, not only through the application (so for example how you can view who your friends follow on instagram), but on the server side too.
+
+to quote this [blog post from Feb 2024](https://steveklabnik.com/writing/how-does-bluesky-work)
+> atproto is for connecting to others, so it’s focused on social applications. It also is currently 100% public, there are no private messages or similar. The reasons for this is that achieving private things in a federated system is very tricky, and they would rather get it right than ship something with serious caveats. Best for now to only use this stuff for things you want to be public.
+
+To show what I've been saying, let's use the atproto developer's profile: [@atproto.com](https://bsky.app/profile/atproto.com). Obviously, the information here is open and accessible; this is a good thing, people should be able to see what you post publicly, in the 'town square' that bsky wants to promote. Another way to see the data is to use an atproto explorer like [the one made by Tom from frontpage](https://atproto-browser.vercel.app/at/atproto.com).
+
+This, as of the writing of this post, has the following
+
+```
+PDS Collections
+---
+app.bsky.actor.profile
+app.bsky.feed.generator
+app.bsky.feed.like
+app.bsky.feed.post
+app.bsky.feed.repost
+app.bsky.graph.follow
+app.bsky.graph.list
+app.bsky.graph.listitem
+app.bsky.graph.starterpack
+chat.bsky.actor.declaration
+```
+
+Now, by their names we understand what each is, but the reader can spend as much time to familiarize themselves as they want. Suffice to say that this makes scraping a lot easier, since everything is stored in unencrypted plaintext JSON.
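To make the scraping point concrete, here is a sketch that only builds the URL for the `com.atproto.repo.listRecords` XRPC query, which returns the records of one collection as JSON. The choice of `bsky.social` as the host serving this repo and the assumption that the query needs no authentication are mine. The helper name is hypothetical, and no request is sent.

```python
from urllib.parse import urlencode


def list_records_url(pds_host: str, repo: str, collection: str, limit: int = 10) -> str:
    """Build the query URL for one collection of one repo (handle or DID)."""
    query = urlencode({"repo": repo, "collection": collection, "limit": limit})
    return f"https://{pds_host}/xrpc/com.atproto.repo.listRecords?{query}"


# Assumed host; the collection name comes from the list above.
url = list_records_url("bsky.social", "atproto.com", "app.bsky.feed.like")
```

Swapping the collection name for any other entry in the list above would address that collection in the same way.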
+
+
+This openness would make it 'vulnerable to data scraping', if it wasn't advertised as a feature, and as a core principle of bsky and atproto in general.
+
+
+## malicious actors
+
+Let's consider the view of a malicious actor, say a government agency, or a malicious individual or organization. What does atproto offer for them?
+
+- *Full network scraping*, since anybody has access to the entire atproto network, collecting data from all PDS servers. This information includes
+ - profiling, identifying information,
+ - posts,
+ - replies,
+ - likes,
+ - location,
+ - interests,
+ - social behaviour,
+ - daily patterns
+ - much more that people with more time can figure out.
+
+- *Individual tracking and profiling*. Since every user's entire, unique history of interactions is stored in an easy to find/easy to read spot, tracing the digital fingerprint of a specific user across all platforms of the network is trivial.
+
+## misuse of data
+
+With the access described above offered to anybody by the atproto network, I want to suggest how the data can be used for wrongdoing. Most people probably have some ideas of how it can be misused, but here are some examples, to really drive home the dangers that this openness enables:
+
+- Stalking and Harassment
+
+ Your abusive ex-anybody has complete and full access to your entire history across all networks. Blocking them? Not while atproto offers full access to your data.
+
+- Specific people of interest.
+
+ If you are a refugee, or an activist, or a whistleblower, fleeing an oppressive regime, or a genocide, or fighting for your freedom, your rights, or those of others, atproto offers a single point of access to all of your activities, to anyone interested. If you are a public figure, then again, _**everything**_ you do in the _**entire** atproto network is **public**_.
+
+- Corporate data collection
+
+ It is probably a known fact that corporations try and collect as much data from their users as possible, either to create [dynamic prices in supermarkets](https://www.startpage.com/search?q=electronic+shelf+labels+(ESLs)+dynamic+pricing), or to create [dynamic insurance prices based on lifestyle](https://www.startpage.com/search?q=dynamic%20pricing%20for%20insurance), or to [feed even more data to the insatiable ai overlords](https://www.startpage.com/search?q=ai%20companies%20running%20out%20of%20data)
+
+- Employer Surveillance
+
+ Employers 'might' (almost surely) scrape data to evaluate current or potential employees, raising ethical and privacy concerns about the workplace.
+
+## the core issue
+
+All of the above point to the same problem that all data collection does, which is the lack of **user consent** and **user control of the data**.
+
+Because of how the atproto network is structured, it inherently exposes and broadcasts data to the entirety of the internet, in a way that **users are not able to opt out of**.
+
+While other platforms face extreme (and extremely well deserved) scrutiny for data handling and data privacy, bsky uniquely seems to have sidestepped the conversation, and instead focuses on marketing itself as an *open source* alternative to Twitter/X, hoping to lure users through the association between open source and non-profit, non-corporate, and private, none of which bsky and atproto are.
+
+
+
+
+[^1]: the physicist in me likes to think about this in the sense of how particles are excitations of quantum fields, applications like bsky or whtwnd are excitations of the underlying atproto network.
diff --git a/content/blog/federations.md b/content/blog/federations.md
new file mode 100644
index 0000000..b090504
--- /dev/null
+++ b/content/blog/federations.md
@@ -0,0 +1,85 @@
+---
+date: 2024-12-01
+title: the drawbacks of federating
+showdate: true
+tags: ['media', 'privacy']
+---
+Essentially the past couple of weeks I have been looking at the three main Twitter/X alternatives, BlueSky and ATProto, Mastodon and ActivityPub, and NOSTR, and specifically trying to understand how they work and which I should use via self hosting.
+<!--more-->
+
+## the dilemma
+
+I haven't really used social media since the government-mandated COVID quarantines of 2020-2022, but I have been trying to find a place for myself, media wise, in the past months (circa Feb 2024). To do this there was a necessity to go with the main stream, which in the case of my professional interests, having to do with AI and privacy, meant Twitter/X and LinkedIn.
+
+I had known Twitter/X for a long time but never used it since it was always an "American" thing that we never used or cared about, but reluctantly I made an account around September 2024. Previously I had heard about Mastodon, and the ActivityPub protocol, which promised a more private, better experience, without algorithms, monetizations, etc.
+
+After the Brazilian government banned its citizens from accessing Twitter/X, I heard about another alternative, BlueSky. This one also promised to be a better experience, but rather than talking about privacy, it focused on openness.
+
+Recently I wrote [a blog post on BlueSky](/blog/bsky) where I went into detail about the things that I consider as a no-use for myself, and for anybody that respects their privacy online. After that, I talked with some friends who are also a bit privacy inclined and they told me to take a look at Nostr as that might be what I wanted.[^1]
+
+Essentially the past couple of weeks I have been looking at the three main Twitter/X alternatives, BlueSky and ATProto, Mastodon and ActivityPub, and NOSTR, and specifically trying to understand how they work and which I should use via self hosting.
+
+What I found out is that in order to use a decentralised or federated social media platform you need to make data public, or you need to find a clean way to transfer data between different servers. In a proprietary, centralised system, such as Twitter/X, you can do this in an obfuscated way, since you control the entire stack, so the frontend and the backend are tightly knit so that no API endpoints come out and God forbid people use them to interact with each other.
+
+Since federation/decentralisation means that servers (in the case of federation) or devices (in the case of decentralisation) need their different backends to interact and keep in sync over the public internet, things become a lot harder. Interestingly, all three options (BlueSky, Mastodon, Nostr) have tackled the same issue in different ways, each with its own drawback. Since I have already talked about [BlueSky on a different blog post](/blog/bsky), I will briefly mention the drawbacks and sacrifices it had to make before I move on to the other protocols.
+
+## atproto
+
+BlueSky works by making use of PDSes (Personal Data Servers), which store the entirety of a person's interaction history across the entire network. This means that if there was an Instagram clone, you wouldn't need to sign up to use it, you would already have an account, since when you signed up to [bsky.social](https://bsky.social), you didn't sign up to [bsky.app](https://bsky.app) but rather you signed up for a PDS stored under bsky.social, hence the @-handle is @username.bsky.social[^2].
+
+In this sense, BlueSky has done the bare minimum to be considered a federated network; if they block every other PDS from federating with them, they would be Twitter[^3].
+
+Because they opened the atproto though, they had to figure out how to transmit and store data such as followers and likes to other PDSes on the federated network. For example, when I self hosted my own PDS, when I liked a post on bsky.app, that information had to be stored on my PDS, but if someone from the bsky.social PDS opened the post, then there should be a way for them to see that I liked the post in the past.
+
+That way is unencrypted plain text.
+
+Obviously this is very bad. Not only are malicious people and groups able to spy/stalk without anybody's permission or knowledge, but because the entire interaction history is public, people are not able to assert ownership over their data. Also, since the data is publicly available, it is true that BlueSky or any other entity can't sell your data, because the data is free for everyone. For people that are aware of these issues, it can be manageable; the old rule of pre-social media internet rings true.
+
+> Don't share anything on the internet that is personally identifiable. You don't know who the person reading this is.
+
+The rule of thumb of using BlueSky or any atproto app then should be to treat every post as if it is going to be read by a stalker, a foreign government, or by the person or group of people you want to hide from the most.
+
+I guess you can call this the [Jack Dorsey Special™](https://www.theguardian.com/technology/article/2024/may/07/jack-dorsey-quits-bluesky-board-urges-users-stay-elon-musk-x-twitter)
+
+## nostr
+
+Nostr is a protocol that has the advantage over atproto of not having a centralised server. Rather, it uses relays with some of them able to store data and others simply moving traffic. This means that as long as you choose to connect and write to multiple relays, you have multiple, synced backups of your data, and then you effectively are uncensorable, which is what the protocol is all about.
+
+The problem again is that nostr also transmits the data in plaintext JSON, exactly as atproto does. This comes with the added advantage/disadvantage, that since the relays are spread across the network, any data transmitted cannot be deleted *and* is public.
+
+Jack Dorsey you madman, [you've done it again!](https://observer.com/2024/05/jack-dorsey-donation-opensats-open-source-development/)
+
+The distinction here is that because the network does not use centralised data servers, the data are harder to aggregate and collect. One would need to set up a relay and try to convince people to use that relay, and even then people could choose to move away from said relay and the data would stay static and stale, which seems to be better than what atproto provides. This however does not guarantee privacy in any capacity, since just by picking packets in transit you can probably fingerprint users, like how users are fingerprinted anyways online.
+
+The other issue with nostr is that of user experience.
+
+Nostr tends to be favoured by the cryptocurrency community because they are used to the ritualistic behaviours associated with managing and juggling private keys. When you join the nostr network you are provided with a private/public key pair. The private key is your identity on the network, with which you sign all interactions to prove that they are indeed made by you. The relays and the clients then can use the public key to verify that the message was indeed sent by the person that it claims to be from. Like a signature.
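What gets signed can be sketched with the standard library alone. The sketch follows my reading of NIP-01, where the event id is the SHA-256 of a compact JSON array `[0, pubkey, created_at, kind, tags, content]`. The public key below is a placeholder and not a real key. The signature itself is a Schnorr signature over secp256k1, which needs a third-party library and is left out.

```python
import hashlib
import json


def nostr_event_id(pubkey_hex: str, created_at: int, kind: int, tags: list, content: str) -> str:
    """Hash the compact JSON serialization of the event fields (per NIP-01, as assumed here)."""
    payload = [0, pubkey_hex, created_at, kind, tags, content]
    # No whitespace between tokens, UTF-8 output instead of \u escapes.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


PLACEHOLDER_PUBKEY = "ab" * 32  # 32 bytes as hex, purely illustrative
event_id = nostr_event_id(PLACEHOLDER_PUBKEY, 1700000000, 1, [], "hello nostr")
```

Because the id depends on every field, changing the content or the key yields a different id, and a relay can recompute it before checking the signature against the public key.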
+
+I mentioned cryptocurrencies because in order to have a wallet that you own and operate, you need to keep the seed phrase secure, since it is the only way to verify that it is indeed your wallet. Likewise, if the private key in nostr leaks, anybody can impersonate you and take over your social media, as this is the only method by which you can be verified and identified. After your keys are leaked, you need to generate a new set, admit that the other account has been lost, and start from scratch, without followers.
+
+This raises another issue. Logging in to a client application requires an identifying mechanism. Handing the private key to the client directly would be too dangerous, since it would require an extra step of trust, which is why there are browser extensions and smartphone applications to handle that.
+
+This probably means that nostr will never get a big userbase like atproto, since the average user doesn't want to (and shouldn't have to) worry about writing down private keys on pieces of paper for privacy, and then running specific browser extensions or applications in order to verify their identity (password managers can already be used for these things, and the vast majority of people don't use them, which supports the above speculation).
+
+## activitypub
+
+ActivityPub is probably the most private of the three when it comes to server/backend access and data storage, since it does not use public databases to transmit data, and rather uses a unified API system across different platforms to facilitate interoperation. The issue of data safety still exists, since anyone can set up their own instance, federate with the broader network and collect data, but that is closer to traditional web scraping than finding an open JSON collection, since platforms and servers can choose to encrypt the backend. Also, since most instances tend to be small and community focused, I would assume it would be easier to notice a malicious entity trying to scrape other servers based on the generated requests, and then either the owner of the compromised server could ban the user, or they could defederate from the server the user is spamming requests at, or the other server's maintainer could defederate from the compromised server.
+
+Also, specifically for Mastodon and other currently used ActivityPub platforms, there are no algorithms or discoverability features, although there is an explore feature to see posts from the entire fediverse, and another one to see recent posts from the specific instance that you use as your "home server".
+
+In terms of interoperability between platforms, since every platform uses the same API, there isn't a "preferred app" when it comes to posting media. You can use your Mastodon account to log in to a PixelFed (Instagram clone) application and use it as you would if you had made the account originally on a PixelFed instance, but the data are stored in the original home server, like BlueSky. In contrast to BlueSky, however, these data are not public outside of people using the actual service.
+
+The main gripe I have with ActivityPub is that it duplicates data, so that when two users interact from different servers, each server keeps a record of the interaction, which effectively means that nothing is ever deleted. You can request a deletion of data, like a post on Mastodon, or a video on PeerTube, but the other servers can choose to not respect your request. Essentially they appropriate the data and now it is part of their server as much as it is of yours.
+
+A good way to think about this is to think of users interacting by sending emails. Both users store the emails in their inboxes, and if one deletes an email, the other one might or might not delete it also.
+
+The other problem is that Mastodon (and by extension ActivityPub, since Mastodon is the biggest service running on the protocol) is very political and therefore tends to self-censor. Each server can choose to implement their own social contract and terms of use, giving the owners of the server or instance control over both who the users interact with, via deciding which servers to federate with, and what the users say, and like BlueSky and unlike Nostr, the data is server specific. This gives rise to the possibility of echo chambers, which is an issue that can also happen on BlueSky, but not on Nostr, since Nostr relies on the user choosing relays to read and write from, rather than the server owner choosing which content to deliver. To some extent this is unavoidable; there is an inherent tradeoff between privacy and moderation, and for most people moderation and freedom *from* speech is more important than privacy and freedom *of* speech.
+
+Obviously the option to host one's own server is still there, but the issue then becomes playing by the same rules that the main servers play by, which are politically biased, otherwise you might get "Fediblocked" where the server gets put on a blacklist that is shared across the network.
+
+
+[^1]: Obviously Nostr and Mastodon are NOT for networking, but at this point it was more of an exploration out of curiosity (and despair) than anything else.
+
+[^2]: This reveals another less important (in my opinion) problem with atproto. Since the data are centralised, the owner of the PDS can modify *your* history of interactions without your consent.
+
+[^3]: Since it is a federated network of servers, the server owner can at any time, and for any reason, terminate either the server, your account, or the connection between the server and other servers. Using a centralised PDS means that every non-owner that uses a service is at the whim of the server owner. This has always been a problem with centralised services, but it just highlights that bsky.social is just one step from being a closed ecosystem.
diff --git a/content/blog/thoughtleaders.md b/content/blog/thoughtleaders.md
new file mode 100644
index 0000000..137ebc0
--- /dev/null
+++ b/content/blog/thoughtleaders.md
@@ -0,0 +1,56 @@
+---
+title: i hate ‘thought leadership’
+date: 2025-03-18
+showdate: true
+tags: ['rant', '']
+---
+
+There has been a new term that I have seen being thrown around, that of 'thought leaders'. Not only is it an insulting term for everyone else, but it is also a very bad term to describe these people. A better phrase would be 'nth order regurgitator'.
+<!--more-->
+## what is thought leadership
+There have always been people with public appeal and a big following, and in business it is often the case that people will try to find shortcuts to success. Rather than go through the arduous process of studying, learning and understanding, in order to have the extremely unpopular trait of one's own opinion, it is much easier to commit the fallacy of appealing to an authority, absorbing surface level information, and regurgitating opinions.
+
+It's not hard to see why this is the case, especially in industry; learning from scratch is not a guaranteed return on investment, especially if you need to invest a significant amount of time or money to get a good grasp on a specific field or sub-field. Regurgitation is a much safer and low-effort alternative behaviour that can still get you a solid 10% of the way to understanding what's actually going on.
+
+At the same time, people who, supposedly, are experts in their area enjoy being recognised for their contributions and their efforts in their industry, and they feel like they have gathered insights to share and help others.
+
+## orders of regurgitation
+In my experience on social media, specifically LinkedIn and X/BlueSky, there are two main ways to create and regurgitate content.
+
+The first, the zero order regurgitation (gurgitation?), is the primary sources. These gurgitators often are acknowledged scientists or accomplished businessmen that have a track record of providing value, originally to their customers, but possibly also to shareholders, owners, or even to no one at all[^1].
+
+The second, the first order regurgitation, is about taking in the talking points and ideas of the gurgitators and presenting one's own interpretation of the primary source, while at the same time managing to remove most of the actual knowledge, since it is something that the regurgitators themselves do not understand most of the time[^2].
+
+## my gripes with the state of things
+There are many issues that I and others have with the status quo.
+
+Personally the main ones are, in order of number of characters for a waterfall effect (aesthetic ordering):
+
+- Appeal to authority is a logical fallacy
+- Regurgitators assume a role of moral authority
+- Personal relations and marketing is not actual insight
+- The use of the word 'leadership' implies a social dynamic
+
+### appeal to authority is a logical fallacy
+It is often the case in fields outside of hard science (and to some extent in the hard sciences also) that the names of the people or organization presenting an idea or product are tied to the idea or product itself. This creates a situation where the opinion of these people is perceived to carry more weight, which can be (and oftentimes is) used and abused to present favourable information as factual, regardless of the actual truth. Even if it is the actual factual truth, the very fact that it is presented after it has been regurgitated detracts from it. It is not yours, you didn't have to think it through, you didn't see the subtleties, you merely read about it.
+
+You can maybe make it yours if you contemplate it, but that's a different discussion for a different mood.
+
+In any case, reliance on authority figures is long known to be a logical fallacy[^3].
+
+### regurgitators assume a role of moral authority
+As regurgitators grow an audience, they feel like they have an obligation to provide moral guidance to the people that listen to them, and they try to provide it. This is stupid. In truth moral authority is given, not taken, and nobody gave the regurgitators the authority to discuss moral matters. Stop trying to tell me how to behave. It's embarrassing.
+
+### personal relations and marketing is not actual insight
+Other terms used for this type of work, such as "content creator", are upfront that there is no benign purpose. "Thought leadership" however implies that the regurgitator is a benign leader, who does this out of the goodness of their heart, in a selfless manner. In truth, most of the time it has the opposite intent, and the regurgitators use the persona of kindness to promote their own products and ideas, or those of their friends, or those who pay them.
+
+### the use of the word 'leadership' implies a social dynamic
+The most egregious issue of the term "thought leader" has to do with the presumed "authority". Apart from the fact that appeal to authority is a fallacy, the assumption of the term "thought leader" by the regurgitators, or their "followers", showcases a very annoying public perception. When a person or entity has the label "leader", it is implied that they are guides, who take away part of your freedom in exchange for knowledge or wisdom. This subtly removes the responsibility of the individual to form their own opinions and atrophies the capacity of the brain to critically assess information[^4]. Don't you dare tell me what I should be thinking about. Regurgitators are at best equivalent to court jesters. Everyone listens to them because they are there, but unfortunately, people forget they are meant to entertain and offer zero insight.
+
+[^1]: This means that they originally provided value, but are now using their fame and public goodwill to propagate their ideas.
+
+[^2]: There are also second, third, ..., n-th tiers of regurgitation, but these are just (n-1) first order regurgitations; the regurgitators just chew, swallow, and then spew back even less useful information.
+
+[^3]: "urm ☝️🤓 ackhtually, by expecting the reader to believe you, you are presenting yourself as an authority. Is it not the case that this is also a logical fallacy?" [Yes it is](https://keimena.aethrvmn.gr/books/metafysika/on-logic/fallacies). Will you trust the person telling you to trust them, or the person who tells you to trust yourself?
+
+[^4]: I am aware this has been a problem throughout the years (the term "influencer" has the same connotations).
Directive (EU) 2019/790, Article 4(3); all rights regarding Text and Data Mining (TDM) are reserved.